gnuastro (0.22)
This is gnuastro.info, produced by makeinfo version 7.1 from
gnuastro.texi.
This book documents version 0.22 of the GNU Astronomy Utilities
(Gnuastro). Gnuastro provides various programs and libraries for
astronomical data manipulation and analysis.
Copyright © 2015-2024 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation; with no Invariant Sections, no Front-Cover Texts, and
no Back-Cover Texts. A copy of the license is included in the
section entitled "GNU Free Documentation License".
INFO-DIR-SECTION Astronomy
START-INFO-DIR-ENTRY
* Gnuastro: (gnuastro). GNU Astronomy Utilities.
* libgnuastro: (gnuastro)Gnuastro library. Full Gnuastro library doc.
* help-gnuastro: (gnuastro)help-gnuastro mailing list. Getting help.
* bug-gnuastro: (gnuastro)Report a bug. How to report bugs.
* Arithmetic: (gnuastro)Arithmetic. Arithmetic operations on pixels.
* astarithmetic: (gnuastro)Invoking astarithmetic. Options to Arithmetic.
* BuildProgram: (gnuastro)BuildProgram. Compile and run programs using Gnuastro's library.
* astbuildprog: (gnuastro)Invoking astbuildprog. Options to BuildProgram.
* ConvertType: (gnuastro)ConvertType. Convert different file types.
* astconvertt: (gnuastro)Invoking astconvertt. Options to ConvertType.
* Convolve: (gnuastro)Convolve. Convolve an input file with kernel.
* astconvolve: (gnuastro)Invoking astconvolve. Options to Convolve.
* CosmicCalculator: (gnuastro)CosmicCalculator. For cosmological params.
* astcosmiccal: (gnuastro)Invoking astcosmiccal. Options to CosmicCalculator.
* Crop: (gnuastro)Crop. Crop region(s) from image(s).
* astcrop: (gnuastro)Invoking astcrop. Options to Crop.
* Fits: (gnuastro)Fits. View and manipulate FITS extensions and keywords.
* astfits: (gnuastro)Invoking astfits. Options to Fits.
* MakeCatalog: (gnuastro)MakeCatalog. Make a catalog from labeled image.
* astmkcatalog: (gnuastro)Invoking astmkcatalog. Options to MakeCatalog.
* MakeProfiles: (gnuastro)MakeProfiles. Make mock profiles.
* astmkprof: (gnuastro)Invoking astmkprof. Options to MakeProfiles.
* Match: (gnuastro)Match. Match two separate catalogs.
* astmatch: (gnuastro)Invoking astmatch. Options to Match.
* NoiseChisel: (gnuastro)NoiseChisel. Detect signal in noise.
* astnoisechisel: (gnuastro)Invoking astnoisechisel. Options to NoiseChisel.
* Segment: (gnuastro)Segment. Segment detections based on signal structure.
* astsegment: (gnuastro)Invoking astsegment. Options to Segment.
* Query: (gnuastro)Query. Access remote databases for downloading data.
* astquery: (gnuastro)Invoking astquery. Options to Query.
* Statistics: (gnuastro)Statistics. Get image Statistics.
* aststatistics: (gnuastro)Invoking aststatistics. Options to Statistics.
* Table: (gnuastro)Table. Read and write FITS binary or ASCII tables.
* asttable: (gnuastro)Invoking asttable. Options to Table.
* Warp: (gnuastro)Warp. Warp a dataset to a new grid.
* astwarp: (gnuastro)Invoking astwarp. Options to Warp.
* astscript: (gnuastro)Installed scripts. Gnuastro's installed scripts.
* astscript-ds9-region: (gnuastro)Invoking astscript-ds9-region. Options to this script
* astscript-fits-view: (gnuastro)Invoking astscript-fits-view. Options to this script
* astscript-pointing-simulate: (gnuastro)Invoking astscript-pointing-simulate. Options to this script
* astscript-psf-scale-factor: (gnuastro)Invoking astscript-psf-scale-factor. Options to this script
* astscript-psf-select-stars: (gnuastro)Invoking astscript-psf-select-stars. Options to this script
* astscript-psf-stamp: (gnuastro)Invoking astscript-psf-stamp. Options to this script
* astscript-psf-subtract: (gnuastro)Invoking astscript-psf-subtract. Options to this script
* astscript-psf-unite: (gnuastro)Invoking astscript-psf-unite. Options to this script
* astscript-radial-profile: (gnuastro)Invoking astscript-radial-profile. Options to this script
* astscript-sort-by-night: (gnuastro)Invoking astscript-sort-by-night. Options to this script
* astscript-zeropoint: (gnuastro)Invoking astscript-zeropoint. Options to this script
END-INFO-DIR-ENTRY
File: gnuastro.info, Node: NoiseChisel output, Prev: Detection options, Up: Invoking astnoisechisel
7.2.2.3 NoiseChisel output
..........................
NoiseChisel's output is a multi-extension FITS file. The main
extension/dataset is a (binary) detection map. It has the same size as
the input but with only two possible values for all pixels: 0 (for
pixels identified as noise) and 1 (for those identified as
signal/detections). The detection map is followed by a Sky and a Sky
standard deviation dataset (both calculated using the binary detection
map). By default (when ‘--rawoutput’ is not called), NoiseChisel will
also subtract the Sky value from the input and save the sky-subtracted
input as the first extension of the output that contains data. The
zero-th extension (which contains no data) stores NoiseChisel's
configuration as FITS keywords, see *note Output FITS files::.
The name of the output file can be set by giving a value to
‘--output’ (this is a common option between all programs and is
therefore discussed in *note Input output options::). If ‘--output’ is
not used, the input name will be suffixed with ‘_detected.fits’ and used
as output, see *note Automatic output::. If any of the options starting
with ‘--check*’ are given, NoiseChisel will not complete and will abort
as soon as the respective check images are created. For more
information on the different check images, see the description for the
‘--check*’ options in *note Detection options:: (this can be disabled
with ‘--continueaftercheck’).
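For example (with a hypothetical input named ‘image.fits’), the first
command below will write its output to ‘image_detected.fits’, while the
second sets the output name explicitly:
$ astnoisechisel image.fits
$ astnoisechisel image.fits --output=nc.fits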
The last two extensions of the output are the Sky and its Standard
deviation, see *note Sky value:: for a complete explanation. They are
calculated on the tile grid that you defined for NoiseChisel. By
default these datasets will have the same size as the input, but with
all the pixels in one tile given one value. To be more space-efficient
(keep only one pixel per tile), you can use the ‘--oneelempertile’
option, see *note Tessellation::.
To inspect any of NoiseChisel's output files, assuming you use SAO
DS9, you can configure your Graphic User Interface (GUI) to open
NoiseChisel's output as a multi-extension data cube. This will allow
you to flip through the different extensions and visually inspect the
results. This process has been described for the GNOME GUI (most common
GUI in GNU/Linux operating systems) in *note Viewing FITS file contents
with DS9 or TOPCAT::.
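For example, assuming the output is called ‘nc.fits’, one quick way to
open it as a multi-extension data cube is with Gnuastro's
‘astscript-fits-view’ installed script (see *note Invoking
astscript-fits-view::):
$ astscript-fits-view nc.fits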
NoiseChisel's output configuration options are described in detail
below.
‘--continueaftercheck’
Continue NoiseChisel after any of the options starting with
‘--check’ (see *note Detection options::). NoiseChisel involves
many steps and as a result, there are many checks, allowing you to
inspect the status of the processing. The results of each step
affect the next steps of processing. Therefore, when you only want
to check the status of the processing at one step, the time spent
to complete the rest of NoiseChisel's run is wasted and distracting.
To encourage easier experimentation with the option values, when
you use any of the NoiseChisel options that start with ‘--check’,
NoiseChisel will abort once its desired extensions have been
written. With ‘--continueaftercheck’ option, you can disable this
behavior and ask NoiseChisel to continue with the rest of the
processing, even after the requested check files are complete.
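For example, the hypothetical command below asks for one of the
check images (with ‘--checkdetection’, see *note Detection
options::), but continues until the final output is also created:
$ astnoisechisel image.fits --checkdetection --continueaftercheck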
‘--ignoreblankintiles’
Do not set the input's blank pixels to blank in the tiled outputs
(for example, Sky and Sky standard deviation extensions of the
output). This is only applicable when the tiled output has the
same size as the input, in other words, when ‘--oneelempertile’ is
not called.
By default, blank values in the input (commonly on the edges, which
are outside the survey/field area) will also be set to blank in the
tiled outputs. But in other scenarios this default behavior is not
desired; for example, if you have masked something in the input,
but still want the tiled output to have values under the masked
regions too.
‘-l’
‘--label’
Run a connected-components algorithm on the final detected pixels
to identify which pixels are connected to which. By default the
main output is a binary dataset with only two values: 0 (for noise)
and 1 (for signal/detections). See *note NoiseChisel output:: for
more.
The purpose of NoiseChisel is to detect targets that are extended
and diffuse, with outer parts that sink into the noise very
gradually (galaxies and stars for example). Since NoiseChisel digs
down to extremely low surface brightness values, many such targets
will commonly be detected together as a single large body of
connected pixels.
To properly separate connected objects, sophisticated segmentation
methods are commonly necessary on NoiseChisel's output. Gnuastro
has the dedicated *note Segment:: program for this job. Since
input images are commonly large and can take significant storage,
the extra volume necessary to store the labels of the connected
components in the detection map (which will be created with this
‘--label’ option, in 32-bit signed integer type) can be a major
waste of space. Since the default output is just a binary dataset,
an 8-bit unsigned dataset is enough.
The binary output will also encourage users to segment the result
separately prior to doing higher-level analysis. As an alternative
to ‘--label’, if you have the binary detection image, you can use
the ‘connected-components’ operator in Gnuastro's Arithmetic
program to identify regions that are connected with each other.
For example, with this command (assuming NoiseChisel's output is
called ‘nc.fits’):
$ astarithmetic nc.fits 2 connected-components -hDETECTIONS
‘--rawoutput’
Do not include the Sky-subtracted input image as the first
extension of the output. By default, the Sky-subtracted input is
put in the first extension of the output. The next extensions are
NoiseChisel's main outputs described above.
The extra Sky-subtracted input can be convenient in checking
NoiseChisel's output and comparing the detection map with the
input: visually see if everything you expected is detected
(reasonable completeness) and that you do not have too many false
detections (reasonable purity). This visual inspection is
simplified if you use SAO DS9 to view NoiseChisel's output as a
multi-extension data-cube, see *note Viewing FITS file contents
with DS9 or TOPCAT::.
When you are satisfied with your NoiseChisel configuration
(therefore you do not need to check on every run), or you want to
archive/transfer the outputs, or the datasets become large, or you
are running NoiseChisel as part of a pipeline, this Sky-subtracted
input image can be a significant burden (take up a large volume).
The fact that the input is also noisy makes it hard to compress it
efficiently.
In such cases, this ‘--rawoutput’ option can be used to avoid the extra
sky-subtracted input in the output. It is always possible to
easily produce the Sky-subtracted dataset from the input (assuming
it is in extension ‘1’ of ‘in.fits’) and the ‘SKY’ extension of
NoiseChisel's output (let's call it ‘nc.fits’) with a command like
below (assuming NoiseChisel was not run with ‘--oneelempertile’,
see *note Tessellation::):
$ astarithmetic in.fits nc.fits - -h1 -hSKY
*Save space:* with the ‘--rawoutput’ and ‘--oneelempertile’,
NoiseChisel's output will only be one binary detection map and two much
smaller arrays with one value per tile. Since none of these have noise
they can be compressed very effectively (without any loss of data) with
exceptionally high compression ratios. This makes it easy to archive,
or transfer, NoiseChisel's output even on huge datasets. To compress it
with the most efficient method (take up less volume), run the following
command:
$ gzip --best noisechisel_output.fits
The resulting ‘.fits.gz’ file can then be fed into any of Gnuastro's
programs directly, or viewed in viewers like SAO DS9, without having to
decompress it separately (they will just take a little longer, because
they have to internally decompress it before starting). See *note
NoiseChisel optimization for storage:: for an example on a real dataset.
File: gnuastro.info, Node: Segment, Next: MakeCatalog, Prev: NoiseChisel, Up: Data analysis
7.3 Segment
===========
Once signal is separated from noise (for example, with *note
NoiseChisel::), you have a binary dataset: each pixel is either signal
(1) or noise (0). Signal (for example, every galaxy in your image) has
been "detected", but all detections have a label of 1. Therefore while
we know which pixels contain signal, we still cannot find out how many
galaxies they contain or which detected pixels correspond to which
galaxy. At the lowest (most generic) level, detection is a kind of
segmentation (segmenting the whole dataset into signal and noise, see
*note NoiseChisel::). Here, we will define segmentation only on signal:
to separate sub-structure within the detections.
If the targets are clearly separated, or their detected regions are
not touching, a simple connected components(1) algorithm (very basic
segmentation) is enough to separate the regions that are
touching/connected. This is such a basic and simple form of
segmentation that Gnuastro's Arithmetic program has an operator for it:
see ‘connected-components’ in *note Arithmetic operators::. Assuming
the binary dataset is called ‘binary.fits’, you can use it with a
command like this:
$ astarithmetic binary.fits 2 connected-components
You can even do a very basic detection (a threshold, say at value ‘100’)
_and_ segmentation in Arithmetic with a single command like below:
$ astarithmetic in.fits 100 gt 2 connected-components
However, in most astronomical situations our targets are not nicely
separated or have a sharp boundary/edge (for a threshold to suffice):
they touch (for example, merging galaxies), or are simply in the same
line-of-sight (which is much more common). This causes their images to
overlap.
In particular, when you do your detection with NoiseChisel, you will
detect signal to very low surface brightness limits: deep into the faint
wings of galaxies or bright stars (which can extend very far and
irregularly from their center). Therefore, it often happens that
several galaxies are detected as one large detection. Since they are
touching, a simple connected components algorithm will not suffice. It
is therefore necessary to do a more sophisticated segmentation and break
up the detected pixels (even those that are touching) into multiple
target objects as accurately as possible.
Segment will use a detection map and its corresponding dataset to
find sub-structure over the detected areas and use them for its
segmentation. Until Gnuastro version 0.6 (released in 2018), Segment
was part of *note NoiseChisel::. Therefore, similar to NoiseChisel, the
best place to start reading about Segment and understanding what it does
(with many illustrative figures) is Section 3.2 of Akhlaghi and Ichikawa
2015 (https://arxiv.org/abs/1505.01664), and continue with Akhlaghi 2019
(https://arxiv.org/abs/1909.11230).
As a summary, Segment first finds true _clump_s over the detections.
Clumps are associated with local maxima/minima(2) and extend over the
neighboring pixels until they reach a local minimum/maximum
(_river_/_watershed_). By default, Segment will use the distribution of
clump signal-to-noise ratios over the undetected regions as reference to
find "true" clumps over the detections. Using the undetected regions
can be disabled by directly giving a signal-to-noise ratio to
‘--clumpsnthresh’.
The true clumps are then grown to a certain threshold over the
detections. Based on the strength of the connections
(rivers/watersheds) between the grown clumps, they are considered parts
of one _object_ or as separate _object_s. See Section 3.2 of Akhlaghi
and Ichikawa 2015 (https://arxiv.org/abs/1505.01664) for more.
Segment's main outputs are thus two labeled datasets: 1) clumps, and 2)
objects. See *note Segment output:: for more.
To start learning about Segment, especially in relation to detection
(*note NoiseChisel::) and measurement (*note MakeCatalog::), the
recommended references are Akhlaghi and Ichikawa 2015
(https://arxiv.org/abs/1505.01664), Akhlaghi 2016
(https://arxiv.org/abs/1611.06387) and Akhlaghi 2019
(https://arxiv.org/abs/1909.11230). If you have used Segment within
your research, please run it with ‘--cite’ to list the papers you should
cite and how to acknowledge its funding sources.
Those papers cannot be updated any more but the software will evolve.
For example, Segment became a separate program (from NoiseChisel) in
2018 (after those papers were published). Therefore this book is the
definitive reference. Finally, in *note Invoking astsegment::, we will
discuss Segment's inputs, outputs and configuration options.
* Menu:
* Invoking astsegment:: Inputs, outputs and options to Segment
---------- Footnotes ----------
(1) <https://en.wikipedia.org/wiki/Connected-component_labeling>
(2) By default the maximum is used as the first clump pixel, to
define clumps based on local minima, use the ‘--minima’ option.
File: gnuastro.info, Node: Invoking astsegment, Prev: Segment, Up: Segment
7.3.1 Invoking Segment
----------------------
Segment will identify substructure within the detected regions of an
input image. Segment's output labels can be directly used for
measurements (for example, with *note MakeCatalog::). The executable
name is ‘astsegment’ with the following general template
$ astsegment [OPTION ...] InputImage.fits
One line examples:
## Segment NoiseChisel's detected regions.
$ astsegment default-noisechisel-output.fits
## Use a hand-input S/N value for keeping true clumps
## (avoid finding the S/N using the undetected regions).
$ astsegment nc-out.fits --clumpsnthresh=10
## Inspect all the segmentation steps after changing a parameter.
$ astsegment input.fits --snquant=0.9 --checksegmentation
## Use the fixed value of 0.01 for the input's Sky standard deviation
## (in the units of the input), and assume all the pixels are a
## detection (for example, a large structure extending over the whole
## image), and only keep clumps with S/N>10 as true clumps.
$ astsegment in.fits --std=0.01 --detection=all --clumpsnthresh=10
If Segment is to do processing (for example, you do not want to get
help, or see the values of each option), at least one input dataset is
necessary along with detection and error information, either as separate
datasets (per-pixel) or fixed values, see *note Segment input::.
Segment shares a large set of common operations with other Gnuastro
programs, mainly regarding input/output, general processing steps, and
general operating modes. To help in a unified experience between all of
Gnuastro's programs, these common operations have the same names and
are defined in *note Common options::.
As in all Gnuastro programs, options can also be given to Segment in
configuration files. For a thorough description of Gnuastro's
configuration file parsing, please see *note Configuration files::. All
of Segment's options with a short description are also always available
on the command-line with the ‘--help’ option, see *note Getting help::.
To inspect the option values without actually running Segment, append
your command with ‘--printparams’ (or ‘-P’).
To help in easy navigation between Segment's options, they are
separately discussed in the three sub-sections below: *note Segment
input:: discusses how you can customize the inputs to Segment. *note
Segmentation options:: is devoted to options specific to the high-level
segmentation process. Finally, in *note Segment output::, we will
discuss options that affect Segment's output.
* Menu:
* Segment input:: Input files and options.
* Segmentation options:: Parameters of the segmentation process.
* Segment output:: Outputs of Segment
File: gnuastro.info, Node: Segment input, Next: Segmentation options, Prev: Invoking astsegment, Up: Invoking astsegment
7.3.1.1 Segment input
.....................
Besides the input dataset (for example, astronomical image), Segment
also needs to know the Sky standard deviation and the regions of the
dataset that it should segment. The values dataset is assumed to be Sky
subtracted by default. If it is not, you can ask Segment to subtract
the Sky internally by calling ‘--sky’. For the rest of this discussion,
we will assume it is already sky subtracted.
The Sky and its standard deviation can be a single value (to be used
for the whole dataset) or a separate dataset (for a separate value per
pixel). If a dataset is used for the Sky and its standard deviation,
they must either be the size of the input image, or have a single value
per tile (generated with ‘--oneelempertile’, see *note Processing
options:: and *note Tessellation::).
The detected regions/pixels can be specified as a detection map (for
example, see *note NoiseChisel output::). If ‘--detection=all’, Segment
will not read any detection map and assume the whole input is a single
detection. For example, when the dataset is fully covered by a large
nearby galaxy/globular cluster.
When datasets are to be used for any of the inputs, Segment will
assume they are multiple extensions of a single file by default (when
‘--std’ or ‘--detection’ are not called); for example, NoiseChisel's
default output (see *note NoiseChisel output::). When the Sky-subtracted
values are in one file, and the detection and Sky standard deviation are
in another, you just need to use ‘--detection’: in the absence of
‘--std’, Segment will look for both the detection labels and Sky
standard deviation in the file given to ‘--detection’. Ultimately, if
all three are in separate files, you need to call both ‘--detection’ and
‘--std’.
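As a summary (with hypothetical file names), the three scenarios above
can be run like below:
## All three inputs in one file (e.g., NoiseChisel's output).
$ astsegment nc.fits
## Values in one file; detection map and Sky STD in another.
$ astsegment values.fits --detection=nc.fits
## All three in separate files.
$ astsegment values.fits --detection=det.fits --std=std.fits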
The extensions of the three mandatory inputs can be specified with
‘--hdu’, ‘--dhdu’, and ‘--stdhdu’. For a full discussion on what to
give to these options, see the description of ‘--hdu’ in *note Input
output options::. To see their default values (along with all the other
options), run Segment with the ‘--printparams’ (or ‘-P’) option. Just
recall that in the absence of ‘--detection’ and ‘--std’, all three are
assumed to be in the same file. If you only want to see Segment's
default values for HDUs on your system, run this command:
$ astsegment -P | grep hdu
By default Segment will convolve the input with a kernel to improve
the signal-to-noise ratio of true peaks. If you already have the
convolved input dataset, you can pass it directly to Segment for faster
processing (using the ‘--convolved’ and ‘--chdu’ options). Just do not
forget that the convolved image must also be Sky-subtracted before
calling Segment. If a value/file is given to ‘--sky’, the convolved
values will also be Sky subtracted internally. Alternatively, if you
prefer to give a kernel (with ‘--kernel’ and ‘--khdu’), Segment can do
the convolution internally. To disable convolution, use
‘--kernel=none’.
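For example (with hypothetical file names), the first command below
asks Segment to do the convolution internally with a given kernel,
while the second uses an already convolved (and Sky-subtracted) image:
$ astsegment nc.fits --kernel=kernel.fits
$ astsegment nc.fits --convolved=conv.fits --chdu=1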
‘--sky=STR/FLT’
The Sky value(s) to subtract from the input. This option can
either be given a constant number or a file name containing a
dataset (multiple values, per pixel or per tile). By default,
Segment will assume the input dataset is Sky subtracted, so this
option is not mandatory.
If the value cannot be read as a number, it is assumed to be a file
name. When the value is a file, the extension can be specified
with ‘--skyhdu’. When it is not a single number, the given dataset
must either have the same size as the output or the same size as
the tessellation (so there is one pixel per tile, see *note
Tessellation::).
When this option is given, its value(s) will be subtracted from the
input and the (optional) convolved dataset (given to ‘--convolved’)
prior to starting the segmentation process.
‘--skyhdu=STR/INT’
The HDU/extension containing the Sky values. This is mandatory
when the value given to ‘--sky’ is not a number. Please see the
description of ‘--hdu’ in *note Input output options:: for the
different ways you can identify a special extension.
‘--std=STR/FLT’
The Sky standard deviation value(s) corresponding to the input.
The value can either be a constant number or a file name containing
a dataset (multiple values, per pixel or per tile). The Sky
standard deviation is mandatory for Segment to operate.
If the value cannot be read as a number, it is assumed to be a file
name. When the value is a file, the extension can be specified
with ‘--stdhdu’. When it is not a single number, the given dataset
must either have the same size as the output or the same size as
the tessellation (so there is one pixel per tile, see *note
Tessellation::).
When this option is not called, Segment will assume the standard
deviation is a dataset in an HDU/extension (‘--stdhdu’) of one of
the input file(s). If a file is given to ‘--detection’, it will
assume that file contains the standard deviation dataset;
otherwise, it will look into the main input file (the first
argument, without any option).
‘--stdhdu=INT/STR’
The HDU/extension containing the Sky standard deviation values,
when the value given to ‘--std’ is a file name. Please see the
description of ‘--hdu’ in *note Input output options:: for the
different ways you can identify a special extension.
‘--variance’
The input Sky standard deviation value/dataset is actually
variance. When this option is called, the square root of the input
value/dataset (see ‘--std’) is used internally, not its raw
value(s).
‘-d FITS’
‘--detection=FITS’
Detection map to use for segmentation. If given a value of ‘all’,
Segment will assume the whole dataset must be segmented, see below.
If a detection map is given, the extension can be specified with
‘--dhdu’. If not given, Segment will assume the desired
HDU/extension is in the main input argument (input file specified
with no option).
The final segmentation (clumps or objects) will only be over the
non-zero pixels of this detection map. The dataset must have the
same size as the input image. Only datasets with an integer type
are acceptable for the labeled image, see *note Numeric data
types::. If your detection map only has integer values, but it is
stored in a floating point container, you can use Gnuastro's
Arithmetic program (see *note Arithmetic::) to convert it to an
integer container, like the example below:
$ astarithmetic float.fits int32 --output=int.fits
It may happen that the whole input dataset is covered by signal,
for example, when working on parts of the Andromeda galaxy, or
nearby globular clusters (that cover the whole field of view). In
such cases, segmentation is necessary over the complete dataset,
not just specific regions (detections). By default Segment will
first use the undetected regions as a reference to find the proper
signal-to-noise ratio of "true" clumps (give a purity level
specified with ‘--snquant’). Therefore, in such scenarios you also
need to manually give a "true" clump signal-to-noise ratio with the
‘--clumpsnthresh’ option to disable looking into the undetected
regions, see *note Segmentation options::. In such cases, is
possible to make a detection map that only has the value ‘1’ for
all pixels (for example, using *note Arithmetic::), but for
convenience, you can also use ‘--detection=all’.
‘--dhdu’
The HDU/extension containing the detection map given to
‘--detection’. Please see the description of ‘--hdu’ in *note
Input output options:: for the different ways you can identify a
special extension.
‘-k FITS’
‘--kernel=FITS’
The name of the file containing the kernel that will be used to
convolve the input image. The usage of this option is identical to
NoiseChisel's ‘--kernel’ option (*note NoiseChisel input::).
Please see the descriptions there for more. To disable
convolution, you can give it a value of ‘none’.
‘--khdu’
The HDU/extension containing the kernel used for convolution. For
acceptable values, please see the description of ‘--hdu’ in *note
Input output options::.
‘--convolved=FITS’
The convolved image's file name to avoid internal convolution by
Segment. The usage of this option is identical to NoiseChisel's
‘--convolved’ option. Please see *note NoiseChisel input:: for a
thorough discussion of the usefulness and best practices of using
this option.
If you want to use the same convolution kernel for detection (with
*note NoiseChisel::) and segmentation, with this option, you can
use the same convolved image (that is also available in
NoiseChisel) and avoid two convolutions. However, just be careful
to use the input to NoiseChisel as the input to Segment also, then
use the ‘--sky’ and ‘--std’ to specify the Sky and its standard
deviation (from NoiseChisel's output). Recall that when
NoiseChisel is not called with ‘--rawoutput’, the first extension
of NoiseChisel's output is the _Sky-subtracted_ input (see *note
NoiseChisel output::). So if you use the same convolved image that
you fed to NoiseChisel, but use NoiseChisel's output with Segment's
‘--convolved’, then the convolved image will not be Sky subtracted.
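As a sketch of this best practice (assuming hypothetical file names
and NoiseChisel's default output extension names), the command below
feeds the raw input and the same convolved image to Segment, while
taking the detection map, Sky and Sky standard deviation from
NoiseChisel's output:
$ astsegment input.fits --convolved=conv.fits --chdu=1 \
             --detection=nc.fits --dhdu=DETECTIONS \
             --sky=nc.fits --skyhdu=SKY \
             --std=nc.fits --stdhdu=SKY_STD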
‘--chdu’
The HDU/extension containing the convolved image (given to
‘--convolved’). For acceptable values, please see the description
of ‘--hdu’ in *note Input output options::.
‘-L INT[,INT]’
‘--largetilesize=INT[,INT]’
The size of the large tiles to use for identifying the clump S/N
threshold over the undetected regions. The usage of this option is
identical to NoiseChisel's ‘--largetilesize’ option (*note
NoiseChisel input::). Please see the descriptions there for more.
The undetected regions can be a significant fraction of the dataset
and finding clumps requires sorting of the desired regions, which
can be slow. To speed up the processing, Segment finds clumps in
the undetected regions over separate large tiles. This allows it
to sort a much smaller set of pixels, and to treat the tiles
independently and in parallel. Both of these greatly speed it up.
Just be sure not to decrease the large tile sizes too much
(less than 100 pixels in each dimension). It is important for them
to be much larger than the clumps.
File: gnuastro.info, Node: Segmentation options, Next: Segment output, Prev: Segment input, Up: Invoking astsegment
7.3.1.2 Segmentation options
............................
The options below can be used to configure every step of the
segmentation process in the Segment program. For a more complete
explanation (with figures to demonstrate each step), please see Section
3.2 of Akhlaghi and Ichikawa 2015 (https://arxiv.org/abs/1505.01664),
and also *note Segment::. By default, Segment will follow the procedure
described in the paper to find the S/N threshold based on the noise
properties. This can be disabled by directly giving a trustable
signal-to-noise ratio to the ‘--clumpsnthresh’ option.
Recall that you can always see the full list of Gnuastro's options
with the ‘--help’ (see *note Getting help::), or ‘--printparams’ (or
‘-P’) to see their values (see *note Operating mode options::).
‘-B FLT’
‘--minskyfrac=FLT’
Minimum fraction (value between 0 and 1) of Sky (undetected) areas
in a large tile. Only (large) tiles with a fraction of undetected
pixels (Sky) greater than this value will be used for finding
clumps. The clumps found in the undetected areas will be used to
estimate a S/N threshold for true clumps. Therefore this is an
important option (to decrease) in crowded fields. Operationally,
this is almost identical to NoiseChisel's ‘--minskyfrac’ option
(*note Detection options::). Please see the descriptions there for
more.
‘--minima’
Build the clumps based on the local minima, not maxima. By
default, clumps are built starting from local maxima (see Figure 8
of Akhlaghi and Ichikawa 2015 (https://arxiv.org/abs/1505.01664)).
Therefore, this option can be useful when you are searching for
true local minima (for example, absorption features).
‘-m INT’
‘--snminarea=INT’
The minimum area which a clump in the undetected regions should
have in order to be considered in the clump Signal to noise ratio
measurement. If this size is set to a small value, the Signal to
noise ratio of false clumps will not be accurately found. It is
recommended that this value be larger than the value given to
NoiseChisel's ‘--snminarea’, because the clumps are found on the
convolved (smoothed) image while the pseudo-detections are found on
the input image. You can use ‘--checksn’ and ‘--checksegmentation’
to see if your chosen value is reasonable or not.
‘--checksn’
Save the S/N values of the clumps over the sky and detected regions
into separate tables. If ‘--tableformat’ is a FITS format, each
table will be written into a separate extension of one file
suffixed with ‘_clumpsn.fits’. If it is plain text, a separate
file will be made for each table (ending in ‘_clumpsn_sky.txt’ and
‘_clumpsn_det.txt’). For more on ‘--tableformat’ see *note Input
output options::.
You can use these tables to inspect the S/N values and their
distribution (in combination with the ‘--checksegmentation’ option
to see where the clumps are). You can use Gnuastro's *note
Statistics:: to make a histogram of the distribution (ready for
plotting in a text file, or a crude ASCII-art demonstration on the
command-line).
With this option, Segment will abort as soon as the two tables are
created. This allows you to inspect the steps leading to the final
S/N quantile threshold; this behavior can be disabled with
‘--continueaftercheck’.
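For example (assuming the input is called ‘nc.fits’ and
‘--tableformat’ is a FITS format), a hypothetical inspection could
look like this:
$ astsegment nc.fits --checksn
$ asttable nc_clumpsn.fits --information
$ aststatistics nc_clumpsn.fits --column=1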
‘--minnumfalse=INT’
The minimum number of clumps over undetected (Sky) regions to
identify the requested Signal-to-Noise ratio threshold.
Operationally, this is almost identical to NoiseChisel's
‘--minnumfalse’ option (*note Detection options::). Please see the
descriptions there for more.
‘-c FLT’
‘--snquant=FLT’
The quantile of the signal-to-noise ratio distribution of clumps in
undetected regions, used to define true clumps. After identifying
all the usable clumps in the undetected regions of the dataset, the
given quantile of their signal-to-noise ratios is used to define
the signal-to-noise ratio of a "true" clump. Effectively, this can
be seen as an inverse p-value measure. See Figure 9 and Section
3.2.1 of Akhlaghi and Ichikawa 2015
(https://arxiv.org/abs/1505.01664) for a complete explanation. The
full distribution of clump signal-to-noise ratios over the
undetected areas can be saved into a table with ‘--checksn’ option
and visually inspected with ‘--checksegmentation’.
‘-v’
‘--keepmaxnearriver’
Keep a clump whose maximum (minimum if ‘--minima’ is called) flux
is 8-connected to a river pixel. By default such clumps over
detections are considered to be noise and are removed irrespective
of their significance measure; see Akhlaghi 2019
(https://arxiv.org/abs/1909.11230). Over large profiles that sink
into the noise very slowly, noise can cause part of the profile
(which was flat without noise) to become a very large clump with a
very high signal-to-noise ratio. In such cases, the pixel with the
maximum flux in the clump will be immediately touching a river
pixel.
‘-s FLT’
‘--clumpsnthresh=FLT’
The signal-to-noise threshold for true clumps. If this option is
given, then the segmentation options above will be ignored and the
given value will be directly used to identify true clumps over the
detections. This can be useful if you have a large dataset with
similar noise properties. You can find a robust signal-to-noise
ratio based on a (sufficiently large) smaller portion of the
dataset. Afterwards, with this option, you can speed up the
processing on the whole dataset. Another scenario where this
option may be useful is when the image does not contain enough (or
any) Sky regions.
‘-G FLT’
‘--gthresh=FLT’
Threshold (a multiple of the Sky standard deviation, added to the
Sky value) to stop growing true clumps. Once true clumps are found,
they are set as the basis to segment the detected region. They are
grown until they reach the threshold specified by this option.
‘-y INT’
‘--minriverlength=INT’
The minimum length of a river between two grown clumps for it to be
considered in signal-to-noise ratio estimations. Similar to
‘--snminarea’, if the length of the river is too short, the
signal-to-noise ratio can be noisy and unreliable. Any existing
rivers shorter than this length will be considered as non-existent,
independent of their Signal to noise ratio. The clumps are grown
on the input image, therefore this value can be smaller than the
value given to ‘--snminarea’. Recall that the clumps were defined
on the convolved image so ‘--snminarea’ should be larger.
‘-O FLT’
‘--objbordersn=FLT’
The maximum Signal to noise ratio of the rivers between two grown
clumps in order to consider them as separate 'objects'. If the
Signal to noise ratio of the river between two grown clumps is
larger than this value, they are defined to be part of one
'object'. Note that the physical reality of these 'objects' can
never be established with one image, or even multiple images from
one broad-band filter. Any method we devise to define 'object's
over a detected region is ultimately subjective.
Two very distant galaxies or satellites in one halo might lie in
the same line of sight and be detected as clumps on one detection.
On the other hand, the connection (through a spiral arm or tidal
tail for example) between two parts of one galaxy might have such a
low surface brightness that they are broken up into multiple
detections or objects. In fact, if you have noticed, exactly for
this purpose, this is the only signal-to-noise ratio that the user
has to give to Segment. The 'true' detections and clumps can be
objectively identified from the noise characteristics of the image,
so you do not have to give any hand-input signal-to-noise ratio.
‘--checksegmentation’
A file with the suffix ‘_seg.fits’ will be created. This file
keeps all the relevant steps in finding true clumps and segmenting
the detections into multiple objects in various extensions. Having
read the paper or the steps above, examining this file can be an
excellent guide in choosing the best set of parameters. Note that
calling this option will significantly slow Segment. In verbose
mode (without the ‘--quiet’ option, see *note Operating mode
options::) the important steps (along with their extension names)
will also be reported.
With this option, Segment will abort as soon as the check image is
created. This behavior can be disabled with ‘--continueaftercheck’.
File: gnuastro.info, Node: Segment output, Prev: Segmentation options, Up: Invoking astsegment
7.3.1.3 Segment output
......................
The main outputs of Segment are two labeled datasets (with integer
types, separating the dataset's elements into different classes). They have
HDU/extension names of ‘CLUMPS’ and ‘OBJECTS’.
Similar to all Gnuastro's FITS outputs, the zero-th extension/HDU of
the main output file contains no image or table data, only header
keywords. It stores the Segment input files and parameters (option
names and values) as FITS keywords. Note that if an option name is
longer than 8
characters, the keyword name is the second word. The first word is
‘HIERARCH’. Also note that according to the FITS standard, the keyword
names must be in capital letters, therefore, if you want to use Grep to
inspect these keywords, use the ‘-i’ option, like the example below.
$ astfits image_segmented.fits -h0 | grep -i snquant
By default, besides the ‘CLUMPS’ and ‘OBJECTS’ extensions, Segment's
output will also contain the (technically redundant) sky-subtracted
input dataset (‘INPUT-NO-SKY’) and the sky standard deviation dataset
(‘SKY_STD’, if it was not a constant number). This can help in visually
inspecting the result when viewing the images as a "Multi-extension data
cube" in SAO DS9 for example, (see *note Viewing FITS file contents with
DS9 or TOPCAT::). You can simply flip through the extensions and see
the same region of the image and its corresponding clumps/object labels.
It also makes it easy to feed the output (as one file) into MakeCatalog
when you intend to make a catalog afterwards (see *note MakeCatalog::).
To remove these redundant extensions from the output (for example, when
designing a pipeline), you can use ‘--rawoutput’.
The ‘OBJECTS’ and ‘CLUMPS’ extensions can be used as input into *note
MakeCatalog:: to generate a catalog for higher-level analysis. If you
want to treat each clump separately, you can give a very large value (or
even a NaN, which will always fail) to the ‘--gthresh’ option (for
example, ‘--gthresh=1e10’ or ‘--gthresh=nan’), see *note Segmentation
options::.
For a complete definition of clumps and objects, please see Section
3.2 of Akhlaghi and Ichikawa 2015 (https://arxiv.org/abs/1505.01664) and
*note Segmentation options::. The clumps are "true" local maxima
(minima if ‘--minima’ is called) and their surrounding pixels until a
local minimum/maximum (caused by noise fluctuations, or another "true"
clump). Therefore it may happen that some of the input detections are
not covered by clumps at all (very diffuse objects without any strong
peak), while some objects may contain many clumps. Even in those that
have clumps, there will be regions that are too diffuse. The diffuse
regions (within the input detected regions) are given a negative label
(-1) to help you separate them from the undetected regions (with a value
of zero).
Each clump is labeled with respect to its host object. Therefore, if
an object has three clumps for example, the clumps within it have labels
1, 2 and 3. As a result, if an initial detected region has multiple
objects, each with a single clump, all the clumps will have a label of
1. The total number of clumps in the dataset is stored in the ‘NCLUMPS’
keyword of the ‘CLUMPS’ extension and printed in the verbose output of
Segment (when ‘--quiet’ is not called).
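For example, assuming Segment's output is called ‘seg.fits’, you can
see this keyword with a command like this:
$ astfits seg.fits -hCLUMPS | grep NCLUMPS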
The ‘OBJECTS’ extension of the output will give a positive
counter/label to every detected pixel in the input. As described in
Akhlaghi and Ichikawa 2015, the true clumps are grown until a certain
threshold. If the grown clumps touch other clumps and the connection is
strong enough, they are considered part of the same _object_. Once
objects (grown clumps) are identified, they are grown to cover the whole
detected area.
The options to configure the output of Segment are listed below:
‘--continueaftercheck’
Do not abort Segment after producing the check image(s). The usage
of this option is identical to NoiseChisel's ‘--continueaftercheck’
option (*note NoiseChisel input::). Please see the descriptions
there for more.
‘--noobjects’
Abort Segment after finding true clumps and do not continue with
finding objects. Therefore, no ‘OBJECTS’ extension will be present
in the output. Each true clump in ‘CLUMPS’ will get a unique
label, but diffuse regions will still have a negative value.
To make a catalog of the clumps, this output can be fed into *note
MakeCatalog:: with ‘--clumpscat’, along with the input detection
map to Segment (that only had a value of ‘1’ for all detected
pixels). In this way, MakeCatalog will assume all the clumps
belong to a single "object".
‘--grownclumps’
In the output ‘CLUMPS’ extension, store the grown clumps. If a
detected region contains no clumps or only one clump, then it will
be fully given a label of ‘1’ (no negative valued pixels).
‘--rawoutput’
Only write the ‘CLUMPS’ and ‘OBJECTS’ datasets in the output file.
Without this option (by default), the first and last extensions of
the output will be the Sky-subtracted input dataset and the Sky
standard deviation dataset (if it was not a number). When the
datasets are small, these redundant extensions can make it
convenient to inspect the results visually or feed the output to
*note MakeCatalog:: for measurements. Ultimately both the input
and Sky standard deviation datasets are redundant (you had them
before running Segment). When the inputs are large/numerous, these
extra datasets can be a burden.
*Save space:* with the ‘--rawoutput’, Segment's output will only be two
labeled datasets (only containing integers). Since they have no noise,
such datasets can be compressed very effectively (without any loss of
data) with exceptionally high compression ratios. You can use the
following command to compress it with the best ratio:
$ gzip --best segment_output.fits
The resulting ‘.fits.gz’ file can then be fed into any of Gnuastro's
programs directly, without having to decompress it separately (it will
just take them a little longer, because they have to decompress it
internally before use).
When the input is a 2D image, to inspect Segment's output you can
configure SAO DS9 in your Graphic User Interface (GUI) to open
Segment's output as a multi-extension data cube. This will allow
you to flip through the different extensions and visually inspect the
results. This process has been described for the GNOME GUI (most common
GUI in GNU/Linux operating systems) in *note Viewing FITS file contents
with DS9 or TOPCAT::.
File: gnuastro.info, Node: MakeCatalog, Next: Match, Prev: Segment, Up: Data analysis
7.4 MakeCatalog
===============
At the lowest level, a dataset (for example, an image) is just a
collection of values, placed after each other in any number of
dimensions (for example, an image is a 2D dataset). Each data-element
(pixel) just has two properties: its position (relative to the rest) and
its value. In higher-level analysis, an entire dataset (an image for
example) is rarely treated as a singular entity(1). You usually want to
know/measure the properties of the (separate) scientifically interesting
targets that are embedded in it. For example, the magnitudes, positions
and elliptical properties of the galaxies that are in the image.
MakeCatalog is Gnuastro's program for localized measurements over a
dataset. In other words, MakeCatalog is Gnuastro's program to convert
low-level datasets (like images), to high level catalogs. The role of
MakeCatalog in a scientific analysis and the benefits of its model
(where detection/segmentation is separated from measurement) is
discussed in Akhlaghi 2016 (https://arxiv.org/abs/1611.06387v1)(2) and
summarized in *note Detection and catalog production::. We strongly
recommend reading this short paper for a better understanding of this
methodology. Understanding the effective usage of MakeCatalog will
thus also help in the effective use of other (lower-level) Gnuastro
programs like *note NoiseChisel:: or *note Segment::.
It is important to define your regions of interest for measurements
_before_ running MakeCatalog. MakeCatalog is specialized in doing
measurements accurately and efficiently. Therefore MakeCatalog will not
do detection, segmentation, or defining apertures on requested positions
in your dataset. Following Gnuastro's modularity principle, there are
separate and highly specialized and customizable programs in Gnuastro
for these other jobs as shown below (for a usage example in a real-world
analysis, see *note General program usage tutorial:: and *note Detecting
large extended targets::).
• *note Arithmetic::: Detection with a simple threshold.
• *note NoiseChisel::: Advanced detection.
• *note Segment::: Segmentation (substructure over detections).
• *note MakeProfiles::: Aperture creation for known positions.
These programs will/can return labeled dataset(s) to be fed into
MakeCatalog. A labeled dataset for measurement has the same
size/dimensions as the input, but with integer valued pixels that have
the label/counter for each sub-set of pixels that must be measured
together. For example, all the pixels covering one galaxy in an image,
get the same label.
The requested measurements are then done on similarly labeled pixels.
The final result is a catalog where each row corresponds to the
measurements on pixels with a specific label. For example, the flux
weighted average position of all the pixels with a label of 42 will be
written into the 42nd row of the output catalog/table's central position
column(3). Similarly, the sum of all these pixels will be the 42nd row
in the sum column, etc. Pixels with labels equal to, or smaller than,
zero will be ignored by MakeCatalog. In other words, the number of rows
in MakeCatalog's output is already known before running it (the maximum
value of the labeled dataset).
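For example, if ‘seg.fits’ is a hypothetical output of Segment (see
*note Segment::, and assuming its default extension names), a basic
catalog (with an assumed zero point of 22.5) can be made with a
command like this:
$ astmkcatalog seg.fits --hdu=OBJECTS --ids --x --y \
               --magnitude --sn --zeropoint=22.5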
Before getting into the details of running MakeCatalog (in *note
Invoking astmkcatalog::), we will start with a discussion on the basics
of its approach to separating detection from measurements in *note
Detection and catalog production::. A very important factor in any
measurement is understanding its validity range, or limits. Therefore
in *note Quantifying measurement limits::, we will discuss how to
estimate the reliability of the detection and basic measurements. This
section will continue with a derivation of elliptical parameters from
the labeled datasets in *note Measuring elliptical parameters::. For
those who feel MakeCatalog's existing measurements/columns are not
enough and would like to add further measurements, in *note Adding new
columns to MakeCatalog::, a checklist of steps is provided for readily
adding your own new measurements/columns.
* Menu:
* Detection and catalog production:: Discussing why/how to treat these separately
* Brightness flux magnitude:: More on Magnitudes, surface brightness, etc.
* Quantifying measurement limits:: For comparing different catalogs.
* Measuring elliptical parameters:: Estimating elliptical parameters.
* Adding new columns to MakeCatalog:: How to add new columns.
* MakeCatalog measurements:: List of all the measurements/columns by MakeCatalog.
* Invoking astmkcatalog:: Options and arguments to MakeCatalog.
---------- Footnotes ----------
(1) You can derive the over-all properties of a complete dataset (1D
table column, 2D image, or 3D data-cube) treated as a single entity with
Gnuastro's Statistics program (see *note Statistics::).
(2) A published paper cannot undergo any more change, so this manual
is the definitive guide.
(3) See *note Measuring elliptical parameters:: for a discussion on
this and the derivation of positional parameters, which includes the
center.
File: gnuastro.info, Node: Detection and catalog production, Next: Brightness flux magnitude, Prev: MakeCatalog, Up: MakeCatalog
7.4.1 Detection and catalog production
--------------------------------------
Most existing common tools in low-level astronomical data-analysis (for
example, SExtractor(1)) merge the two processes of detection and
measurement (catalog production) in one program. However, in light of
Gnuastro's modularized approach (modeled on the Unix system) detection
is separated from measurements and catalog production. This modularity
is therefore new to many experienced astronomers and deserves a short
review here. Further discussion on the benefits of this methodology can
be seen in Akhlaghi 2016 (https://arxiv.org/abs/1611.06387v1).
As discussed in the introduction of *note MakeCatalog::, detection
(identifying which pixels to do measurements on) can be done with
different programs. Their outputs (a labeled dataset) can be directly
fed into MakeCatalog to do the measurements and write the result as a
catalog/table. Beyond that, Gnuastro's modular approach has many
benefits that will become clear as you get more experienced in
astronomical data analysis and want to be more creative in using your
valuable data for the exciting scientific project you are working on.
In short the reasons for this modularity can be classified as below:
• Simplicity/robustness of independent, modular tools: making a
catalog is a logically separate process from labeling (detection,
segmentation, or aperture production). A user might want to do
certain operations on the labeled regions before creating a catalog
for them. Another user might want the properties of the same
pixels/objects in another image (another filter for example) to
measure the colors or do SED fitting.
Here is an example of doing both: suppose you have images in
various broad band filters at various resolutions and orientations.
The image of one color will thus not lie exactly on another or even
be in the same scale. However, it is imperative that the same
pixels be used in measuring the colors of galaxies.
To solve the problem, NoiseChisel can be run on the reference image
to generate the labeled detection image. Afterwards, the labeled
image can be warped into the grid of the other color (using *note
Warp::). MakeCatalog will then generate the same catalog for both
colors (with the different labeled images). It is currently
customary to warp the images to the same pixel grid, however,
modification of the scientific dataset is very harmful for the data
and creates correlated noise. It is much more accurate to do the
transformations on the labeled image.
• Complexity of a monolith: Adding in a catalog functionality to the
detector program will add several more steps (and many more
options) to its processing that can equally well be done outside of
it. This makes following what the program does harder for the
users and developers, and it can also potentially add many bugs.
As an example, if the parameter you want to measure over one
profile is not provided by the developers of MakeCatalog, you can
simply open this tiny little program and add your desired
calculation easily. This process is discussed in *note Adding new
columns to MakeCatalog::. However, if making a catalog was part of
NoiseChisel for example, adding a new column/measurement would
require a lot of energy to understand all the steps and internal
structures of that huge program. It might even be so intertwined
with its processing, that adding new columns might cause
problems/bugs in its primary job (detection).
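As a rough sketch of the color-measurement scenario in the first point
above (all file names here are hypothetical; ‘--gridfile’ asks Warp to
resample onto another image's pixel grid, and ‘--valuesfile’ asks
MakeCatalog to do the measurements over a different image than the
labels):
## Label the reference image.
$ astnoisechisel ref.fits --output=nc.fits
$ astsegment nc.fits --output=seg.fits
## Warp the labels onto the other filter's pixel grid.
$ astwarp seg.fits --hdu=OBJECTS --gridfile=other.fits \
          --output=labels-other.fits
## Measure the same labels over both filters.
$ astmkcatalog seg.fits --hdu=OBJECTS --ids --sum
$ astmkcatalog labels-other.fits --hdu=1 \
               --valuesfile=other.fits --ids --sum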
---------- Footnotes ----------
(1) <https://www.astromatic.net/software/sextractor>
File: gnuastro.info, Node: Brightness flux magnitude, Next: Quantifying measurement limits, Prev: Detection and catalog production, Up: MakeCatalog
7.4.2 Brightness, Flux, Magnitude and Surface brightness
--------------------------------------------------------
Astronomical data pixels are usually in units of counts(1) or electrons
or either one divided by seconds. To convert from the counts to
electrons, you will need to know the instrument gain. In any case, they
can be directly converted to energy or energy/time using the basic
hardware (telescope, camera and filter) information (that is summarized
in the _zero point_, which we will discuss below). We will continue the
discussion assuming the pixels are in units of energy/time.
Brightness
The _brightness_ of an object is defined as its measured energy in
units of time. If our detector pixels directly measured the energy
from the astronomical object(2), then the brightness would be the
total sum of pixel values (energy) associated to the object,
divided by the exposure time. The _flux_ of an object is defined
in units of energy/time/collecting-area. For an astronomical
target, the flux is therefore defined as its brightness divided by
the area used to collect the light from the source; or the
telescope aperture (for example, in units of $cm^2$). Knowing the
flux ($f$) and distance to the object ($r$), we can define its
_luminosity_: $L=4{\pi}r^2f$.
Therefore, while flux and luminosity are intrinsic properties of
the object, brightness depends on our detecting tools (hardware and
software). In low-level observational astronomy data analysis, we
are usually more concerned with measuring the brightness, because
it is the thing we directly measure from the image pixels and
create in catalogs. On the other hand, luminosity is used in
higher-level analysis (after image contents are measured as
catalogs to deduce physical interpretations, because high-level
things like distance/redshift need to be calculated). At this
stage, it is just important avoid confusion between luminosity and
brightness because both have the same units of energy per seconds.
Magnitude
Images of astronomical objects span over a very large range of
brightness: the Sun (as the brightest object) is roughly
$2.5^{60}=10^{24}$ times brighter than the fainter galaxies we can
currently detect in the deepest images. Therefore discussing
brightness directly will involve a large range of values which is
inconvenient. So astronomers have chosen to use a logarithmic
scale for the brightness of astronomical objects.
But the logarithm is only usable with a dimensionless value that is
always positive. Fortunately brightness is always positive (at
least in theory(3)). To remove the dimensions, we divide the
brightness of the object ($B$) by a reference brightness ($B_r$).
We then define a logarithmic scale as $magnitude$ through the
relation below. The $-2.5$ factor in the definition of magnitudes
is a legacy of our ancient colleagues, in particular Hipparchus of
Nicaea (190-120 BC).
$$m-m_r=-2.5\log_{10} \left( B \over B_r \right)$$
$m$ is defined as the magnitude of the object and $m_r$ is the
pre-defined magnitude of the reference brightness. For estimating
the error in measuring a magnitude, see *note Quantifying
measurement limits::.
Zero point
A unique situation in the magnitude equation above occurs when the
reference brightness is unity ($B_r=1$).  The corresponding
reference magnitude will thus summarize all the hardware-specific
parameters discussed above (like the conversion of pixel values to
physical units) into one number.  That reference magnitude is
commonly known as the _zero point_ magnitude, because when
$B=B_r=1$, the right side of the magnitude definition above will be
zero.  Using the zero point magnitude ($Z$), we can write the
magnitude relation above in a simpler format:
$$m = -2.5\log_{10}(B) + Z$$
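For a quick check on the command-line, this relation can be
evaluated with AWK; the brightness of $10000$ (for example, a sum of
pixel values) and the zero point of $22.5$ below are hypothetical
(the printed magnitude is $12.5$).  Note that AWK's ‘log’ is the
natural logarithm, hence the division by ‘log(10)’:

     $ echo 10000 | awk '{print -2.5*log($1)/log(10) + 22.5}'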
Gnuastro has an installed script to estimate the zero point of any
image, see *note Zero point estimation:: (it contains practical
tutorials to help you get started fast). Having the zero point of
an image, you can convert its pixel values to physical units like
microJanskys (or $\mu{}Jy$). This enables direct pixel-based
comparisons with images from other instruments(4). Jansky is a
commonly used unit for measuring spectral flux density and one
Jansky is equivalent to $10^{-26} W/m^2/Hz$ (watts per square meter
per hertz).
This conversion can be done using the fact that, in the AB
magnitude standard(5), $3631Jy$ corresponds to the zero-th
magnitude; therefore $B\equiv3631\times10^{6}\mu{Jy}$ and
$m\equiv0$.  We can therefore estimate the brightness ($B_z$, in
$\mu{Jy}$) corresponding to the image zero point ($Z$) using this
equation:
$$m - Z = -2.5\log_{10}(B/B_z)$$
$$0 - Z = -2.5\log_{10}({3631\times10^{6}\over B_z})$$
$$B_z = 3631\times10^{\left(6 - {Z \over 2.5} \right)} \mu{Jy}$$
Because the image zero point corresponds to a pixel value of $1$,
the $B_z$ value calculated above also corresponds to a pixel value
of $1$.  Therefore you simply have to multiply your image by $B_z$
to convert it to $\mu{Jy}$.  Do not forget that this only applies
when your zero point was also estimated in the AB magnitude system.
On the command-line, you can estimate this value for a certain zero
point with AWK, then multiply all the pixels in the image by it
with *note Arithmetic::.  For example, let's assume you are using
an SDSS image with a zero point of 22.5:

     $ bz=$(echo 22.5 | awk '{print 3631 * 10^(6-$1/2.5)}')
     $ astarithmetic sdss.fits $bz x --output=sdss-in-muJy.fits
But in Gnuastro, it gets even easier: Arithmetic has an operator
called ‘counts-to-jy’.  This will directly convert your image
pixels (in units of counts) to Janskys through a provided AB
magnitude-based zero point like below.  See *note Arithmetic
operators:: for more.
$ astarithmetic sdss.fits 22.5 counts-to-jy
*Be careful with the exposure time:* as described at the start of
this section, we are assuming your data are in units of counts/sec.
As a result, the counts you get from the command above are only for
one second of exposure!  Please see the discussion below in
"Magnitude to counts" for more.
Magnitude to counts (accounting for exposure time)
Until now, we had assumed that the data are in units of counts/sec.
As a result, the equations given above (in the "Zero point" item to
convert magnitudes to pixel counts), give the count level for the
reference (1 second) exposure. But we rarely take 1 second
exposures! It is therefore very important to take the exposure
time into account in scenarios like simulating observations with
varying exposure times (where you need to know how many counts the
object of a certain magnitude will add to a certain image with a
certain exposure time).
To clarify the concept, let's define $C$ as the _counted_ electrons
(which has a linear relation with the photon energy entering the
CCD pixel). In this case, if an object of brightness $B$ is
observed for $t$ seconds, it will accumulate $C=B\times t$
counts(6). Therefore, the generic magnitude equation above can be
written as:
$$m = -2.5\log_{10}(B) + Z = -2.5\log_{10}(C/t) + Z$$
From this, we can derive $C(t)$ in relation to $C(1)$, or the
counts from a 1 second exposure, using this relation:
$$C(t) = t\times10^{(Z-m)/2.5} = t\times C(1)$$
In other words, you should simply multiply the counts for one
second with the number of observed seconds.
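As a minimal sketch of this relation (with a hypothetical magnitude
of 20, zero point of 22.5 and exposure time of 600 seconds), the
expected counts can be estimated with AWK like below (printing
$600\times10=6000$ counts):

     $ echo "20 22.5 600" \
           | awk '{m=$1; z=$2; t=$3; print t * 10^((z-m)/2.5)}'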
Another approach is to shift the time-dependence of the counts into
the zero point (after all, the exposure time is also a hardware
issue).  Let's derive the equation below:
$$m = -2.5\log_{10}(C/t) + Z = -2.5\log_{10}(C) + 2.5\log_{10}(t) + Z$$
Therefore, defining an exposure-time-dependent zero point as
$Z(t)$, we can directly correlate a certain object's magnitude with
counts after an exposure of $t$ seconds:
$$m = -2.5\log_{10}(C) + Z(t) \quad\rm{where}\quad Z(t)=Z + 2.5\log_{10}(t)$$
This solution is useful in programs like *note MakeCatalog:: or
*note MakeProfiles::, when you cannot (or do not want to, because
of the extra storage/speed costs) manipulate the values image (for
example, divide it by the exposure time to use a counts/sec zero
point).
Surface brightness
Another important concept is the distribution of an object's
brightness over its area.  For this, we define the _surface
brightness_ to be the magnitude of an object's brightness divided
by its solid angle over the celestial sphere (or coverage in the
sky, commonly in units of arcsec$^2$).  The solid angle is
expressed in units of arcsec$^2$ because astronomical targets are
usually much smaller than one steradian.  Recall that the steradian
is the dimensionless SI unit of solid angle and 1 steradian covers
$1/(4\pi)$ (almost $8\%$) of the full celestial sphere.
Surface brightness is therefore most commonly expressed in units of
mag/arcsec$^2$.  For example, when the brightness is measured over
an area of $A$ arcsec$^2$, the surface brightness becomes:
$$S = -2.5\log_{10}(B/A) + Z = -2.5\log_{10}(B) + 2.5\log_{10}(A) + Z$$
In other words, the surface brightness (in units of mag/arcsec$^2$)
is related to the object's magnitude ($m$) and area ($A$, in units
of arcsec$^2$) through this equation:
$$S = m + 2.5\log_{10}(A)$$
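As a hedged numerical example of this equation (with a hypothetical
magnitude of 18 measured over an area of 25 arcsec$^2$, giving
$S\approx21.5$ mag/arcsec$^2$):

     $ echo "18 25" | awk '{m=$1; A=$2; print m + 2.5*log(A)/log(10)}'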
A common mistake is to follow the mag/arcsec$^2$ unit literally and
divide the object's magnitude by its area.  This is wrong because
magnitude is a logarithmic scale while area is linear.  It is the
brightness that should be divided by the solid angle, because both
have linear scales; the magnitude of that ratio is then defined to
be the surface brightness.
One common application of this is to convert an image's pixel
values to surface brightness when you know its zero point.  This
can be done with the two simple commands below: first, we derive
the pixel area (in arcsec$^2$), then we use Arithmetic to convert
the pixels into surface brightness.
$ zeropoint=22.5
$ pixarea=$(astfits image.fits --pixelareaarcsec2)
$ astarithmetic image.fits $zeropoint $pixarea counts-to-sb \
--output=image-sb.fits
See *note Reverse polish notation:: for more on Arithmetic's
notation and *note Arithmetic operators:: for a description of each
operator. And see *note FITS images in a publication:: for a fully
working tutorial on how to optimally convert a FITS image to a PDF
image for usage in a publication using the surface brightness
conversion shown above.
*Do not warp or convolve magnitude or surface brightness images:*
Warping an image involves calculating new pixel values (of the new
pixel grid) from the old pixel values. Convolution is also a
process of finding the weighted mean of pixel values. During these
processes, many arithmetic operations are done on the original
pixel values, for example, addition or multiplication. However,
$\log_{10}(a+b)\ne\log_{10}(a)+\log_{10}(b)$.  Therefore after
calculating a magnitude or surface brightness image, do not apply
any such operations on it! If you need to warp or convolve the
image, do it _before_ the conversion.
---------- Footnotes ----------
(1) Counts are also known as analog to digital units (ADU).
(2) In practice, the measured pixels don't just count the
astronomical object's energy: imaging detectors insert a certain bias
level before the exposure, they amplify the photo-electrons, there are
optical artifacts like flat-fielding, and finally, there is the
background light.
(3) In practice, for very faint objects, if the background brightness
is over-subtracted, we may end up with a negative "brightness" or sum of
pixels in a real object.
(4) Comparing data from different instruments assumes instrument and
observation signatures are properly corrected, things like the
flat-field or the Sky absorption. It is also valid for pixel values,
assuming that factors that can change the morphology (like the *note
PSF::) are the same.
(5) <https://en.wikipedia.org/wiki/AB_magnitude>
(6) Recall that counts are another name for ADUs, which already
include the CCD gain.
File: gnuastro.info, Node: Quantifying measurement limits, Next: Measuring elliptical parameters, Prev: Brightness flux magnitude, Up: MakeCatalog
7.4.3 Quantifying measurement limits
------------------------------------
No measurement on a real dataset can be perfect: you can only reach a
certain level/limit of accuracy and a meaningful (scientific) analysis
requires an understanding of these limits. Different datasets have
different noise properties and different detection methods (one
method/algorithm/software that is run with a different set of parameters
is considered as a different detection method) will have different
abilities to detect or measure certain kinds of signal (astronomical
objects) and their properties in the dataset. Hence, quantifying the
detection and measurement limitations with a particular dataset and
analysis tool is the most crucial/critical aspect of any high-level
analysis. In two separate tutorials, we have touched upon some of these
points. So to see the discussions below in action (on real data), see
*note Measuring the dataset limits:: and *note Image surface brightness
limit::.
Here, we will review some of the most commonly used methods to
quantify the limits in astronomical data analysis and how MakeCatalog
makes it easy to measure them. Depending on the higher-level analysis,
there are more tests that must be done, but these are relatively
low-level and usually necessary in most cases.  In astronomy, it is
common to use the magnitude (a unit-less scale, see *note Brightness
flux magnitude::) rather than raw physical units.  Therefore the
measurements discussed here are commonly given in units of magnitudes.
* Menu:
* Standard deviation vs error:: The std is not a measure of the error.
* Magnitude measurement error of each detection:: Error in measuring magnitude.
* Surface brightness error of each detection:: Error in measuring the Surface brightness.
* Completeness limit of each detection:: Possibility of detecting similar objects?
* Upper limit magnitude of each detection:: How reliable is your magnitude?
* Magnitude limit of image:: Measured magnitude of objects at certain S/N.
* Surface brightness limit of image:: Extrapolate per-pixel noise-level to standard units.
* Upper limit surface brightness of image:: Measure the noise-level for a certain aperture.
File: gnuastro.info, Node: Standard deviation vs error, Next: Magnitude measurement error of each detection, Prev: Quantifying measurement limits, Up: Quantifying measurement limits
7.4.3.1 Standard deviation vs error
...................................
The error and the standard deviation are sometimes confused with each
other. Therefore, before continuing with the various measurement limits
below, let's review these two fundamental concepts. Instead of going
into the theoretical definitions of the two (which you can see in their
respective Wikipedia pages), we'll discuss the concepts in a hands-on
and practical way here.
Let's simulate an observation of the sky, but without any
astronomical sources!  In other words, one where we only have a
background flux level (from the sky emission).  With the first command
below, let's make an image called ‘1.fits’ that contains $200\times200$
pixels that are filled with random noise from a Poisson distribution
with a mean of 100 counts (the flux from the background sky).  Recall
that the Poisson distribution approaches a normal distribution at large
mean values (as in this case).
The standard deviation ($\sigma$) of the Poisson distribution is the
square root of the mean, see *note Photon counting noise::. With the
second command, we'll have a look at the image.  Note that due to the
random nature of the noise, the values reported in the next steps on
your computer will be slightly different.  To reproduce exactly the
same values in different runs, see *note Generating random numbers::,
and for more on the first command, see *note Arithmetic::.
$ astarithmetic 200 200 2 makenew 100 mknoise-poisson \
--output=1.fits
$ astscript-fits-view 1.fits
Each pixel shows the result of one sampling from the Poisson
distribution. In other words, assuming the sky emission in our
simulation is constant over our field of view, each pixel's value shows
one measurement of the sky emission. Statistically speaking, a
"measurement" is a sampling from an underlying distribution of values.
Through our measurements, we aim to identify that underlying
distribution (the "truth")! With the command below, let's look at the
pixel statistics of ‘1.fits’ (output is shown immediately under it).
$ aststatistics 1.fits
Statistics (GNU Astronomy Utilities) 0.22
-------
Input: 1.fits (hdu: 1)
-------
Number of elements: 40000
Minimum: 61
Maximum: 155
Median: 100
Mean: 100.044925
Standard deviation: 10.00066032
-------
Histogram:
| * *
| * * *
| * * * *
| * * * * *
| * * * * *
| * * ******** * *
| * ************* *
| * ****************** *
| ************************ *
| *********************************
|* ********************************************************** ** *
|----------------------------------------------------------------------
As expected, you see that the ASCII histogram nicely resembles a
normal distribution. The measured mean and standard deviation
($\sigma_x$) are also very similar to the input (mean of 100, standard
deviation of $\sigma=10$). But the measured mean (and standard
deviation) aren't exactly equal to the input!
Every time we make a different simulated image from the same
distribution, the measured mean and standard deviation will slightly
differ. With the second command below, let's build 500 images like
above and measure their mean and standard deviation. The outputs will
be written into a file (‘mean-stds.txt’; in the first command we are
deleting it to make sure we write into an empty file within the loop).
With the third command, let's view the top 10 rows:
$ rm -f mean-stds.txt
$ for i in $(seq 500); do \
astarithmetic 200 200 2 makenew 100 mknoise-poisson \
--output=$i.fits --quiet; \
aststatistics $i.fits --mean --std >> mean-stds.txt; \
echo "$i: complete"; \
done
$ asttable mean-stds.txt -Y --head=10
99.989381 9.936407
100.036622 10.059997
100.006054 9.985470
99.944535 9.960069
100.050318 9.970116
100.002718 9.905395
100.067555 9.964038
100.027167 10.018562
100.051951 9.995859
100.000212 9.970293
From this table, you see that each simulation has produced a slightly
different measured mean and measured standard deviation ($\sigma_x$)
that are just fluctuating around the input mean (which was 100) and
input standard deviation ($\sigma=10$). Let's have a look at the
distribution of mean measurements:
$ aststatistics mean-stds.txt -c1
Statistics (GNU Astronomy Utilities) 0.22
-------
Input: mean-stds.txt
Column: 1
-------
Number of elements: 500
Minimum: 9.98183528700191e+01
Maximum: 1.00146490891332e+02
Mode: 99.99709739
Mode quantile: 0.49498998
Median: 9.99977393190436e+01
Mean: 99.99891826
Standard deviation: 0.04901635275
-------
Histogram:
| *
| * **
| ****** **** * *
| ****** **** * * *
| * * ************* * *
| * ****************** **
| * ********************* *** *
| * ***************************** ***
| *** ********************************** *
| *** ******************************************* **
| * ************************************************* ** *
|----------------------------------------------------------------------
The standard deviation of the various mean measurements above shows
the scatter in measuring the mean with an image of this size from this
underlying distribution. This is therefore defined as the _standard
error of the mean_, or "error" for short (since most measurements are
actually the mean of a population) and shown with
$\widehat\sigma_{\bar{x}}$.
From the example above, you see that the error is smaller than the
standard deviation (and becomes smaller when you have a larger sample).
In fact, it can be shown
(https://en.wikipedia.org/wiki/Standard_error#Derivation) that this
"error of the mean" ($\sigma_{\bar{x}}$) is related to the
distribution's standard deviation ($\sigma$) through the equation
below, where $N$ is the number of points used to measure the mean in
one sample ($200\times200=40000$ in this case).  Note that the $10.0$
below was reported as the "standard deviation" in the first run of
‘aststatistics’ on ‘1.fits’ above:
$$\sigma_{\bar{x}}=\frac{\sigma}{\sqrt{N}} \quad\quad {\rm or} \quad\quad \widehat\sigma_{\bar{x}}\approx\frac{\sigma_x}{\sqrt{N}} = \frac{10.0}{200} = 0.05$$
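You can confirm this relation with the simulated measurements above:
the first command below measures the scatter of the 500 means directly,
while the second derives it from a single image's standard deviation.
Both numbers should be close to $0.05$ (they will differ slightly
because of the random noise):

     $ aststatistics mean-stds.txt -c1 --std
     $ aststatistics 1.fits --std | awk '{print $1/sqrt(40000)}'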
Taking the considerations above into account, we should clearly
distinguish the following concepts when talking about the standard
deviation or error:
Standard deviation of population
This is the standard deviation of the underlying distribution (10
in the example above), and shown by $\sigma$. This is something
you can never measure, and is just the ideal value.
Standard deviation of mean
Ideal error of measuring the mean (assuming we know $\sigma$);
shown with $\sigma_{\bar{x}}$ in the equation above.
Standard deviation of sample (i.e., _standard deviation_)
Measured standard deviation from a sampling of the ideal
distribution.  This is the second column of ‘mean-stds.txt’ above
and is shown with $\sigma_x$.  In astronomical literature, this is
simply referred to as the "standard deviation".
In other words, the standard deviation is computed on the input
itself, so MakeCatalog just needs a "values" file: for example,
when measuring the standard deviation of an astronomical object
with MakeCatalog, it is computed directly from the input values.
Standard error (i.e., _error_)
Measurable scatter of measuring the mean
($\widehat\sigma_{\bar{x}}$) that can be estimated from the size of
the sample and the measured standard deviation ($\sigma_x$). In
astronomical literature, this is simply referred to as the "error".
In other words, when asking for an "error" measurement with
MakeCatalog, a separate standard deviation dataset should always be
provided.  This dataset should take into account all sources of
scatter.  For example, during the reduction of an image, the
standard deviation dataset should take into account the dispersion
of each pixel that comes from the bias, dark, flat fielding, etc.
If this image is not available, it is possible to use the ‘SKY_STD’
extension from NoiseChisel as an estimation. For more see *note
NoiseChisel output::.
File: gnuastro.info, Node: Magnitude measurement error of each detection, Next: Surface brightness error of each detection, Prev: Standard deviation vs error, Up: Quantifying measurement limits
7.4.3.2 Magnitude measurement error of each detection
.....................................................
The raw error in measuring the magnitude is only meaningful when the
object's magnitude is brighter than the upper-limit magnitude (see
below). As discussed in *note Brightness flux magnitude::, the
magnitude ($M$) of an object with brightness $B$ and zero point
magnitude $z$ can be written as:
$$M=-2.5\log_{10}(B)+z$$
Calculating the derivative with respect to $B$, we get:
$${dM\over dB} = {-2.5\over {B\times ln(10)}}$$
From the Taylor series ($\Delta{M}=dM/dB\times\Delta{B}$), we can write:
$$\Delta{M} = \left|{-2.5\over ln(10)}\right|\times{\Delta{B}\over{B}}$$
But, $\Delta{B}/B$ is just the inverse of the Signal-to-noise ratio
($S/N$), so we can write the error in magnitude in terms of the
signal-to-noise ratio:
$$\Delta{M} = {2.5\over{S/N\times ln(10)}} $$
MakeCatalog uses this relation to estimate the magnitude errors.  The
signal-to-noise ratio is calculated in different ways for clumps and
objects (see Akhlaghi and Ichikawa 2015
(https://arxiv.org/abs/1505.01664)), but this single equation can be
used to estimate the measured magnitude error afterwards for any type
of target.
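For example, the magnitude error of a detection with a
signal-to-noise ratio of 10 can be estimated on the command-line like
below (giving $\approx0.109$ magnitudes):

     $ echo 10 | awk '{print 2.5/($1*log(10))}'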
File: gnuastro.info, Node: Surface brightness error of each detection, Next: Completeness limit of each detection, Prev: Magnitude measurement error of each detection, Up: Quantifying measurement limits
7.4.3.3 Surface brightness error of each detection
..................................................
We can derive the error in measuring the surface brightness based on the
surface brightness (SB) equation of *note Brightness flux magnitude::
and the generic magnitude error ($\Delta{M}$) of *note Magnitude
measurement error of each detection::. Let's set $A$ to represent the
area and $\Delta{A}$ to represent the error in measuring the area. For
more on $\Delta{A}$, see the description of ‘--spatialresolution’ in
*note MakeCatalog inputs and basic settings::.
$$\Delta{(SB)} = \Delta{M} + \left|{-2.5\over ln(10)}\right|\times{\Delta{A}\over{A}}$$
In the surface brightness equation mentioned above, $A$ is in units
of arcseconds squared and the conversion between arcseconds and pixels
is a multiplication factor.  Therefore, as long as $A$ and $\Delta{A}$
have the same units, it does not matter if they are in arcseconds or
pixels.  Since the measure of spatial resolution (or area error) is the
FWHM of the PSF, which is usually defined in terms of pixels, it is
more intuitive to use pixels for $A$ and $\Delta{A}$.
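As a rough numerical sketch of the equation above (with hypothetical
values: a magnitude error of 0.05, an area of 100 pixels and an area
error of 2 pixels):

     $ echo "0.05 100 2" \
           | awk '{dm=$1; A=$2; dA=$3; print dm + (2.5/log(10))*(dA/A)}'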
File: gnuastro.info, Node: Completeness limit of each detection, Next: Upper limit magnitude of each detection, Prev: Surface brightness error of each detection, Up: Quantifying measurement limits
7.4.3.4 Completeness limit of each detection
............................................
As the surface brightness of the objects decreases, the ability to
detect them will also decrease. An important statistic is thus the
fraction of objects of similar morphology and magnitude that will be
detected with our detection algorithm/parameters in a given image. This
fraction is known as _completeness_. For brighter objects, completeness
is 1: all bright objects that might exist over the image will be
detected. However, as we go to objects of lower overall surface
brightness, we will fail to detect a fraction of them, and fainter than
a certain surface brightness level (for each morphology), nothing will
be detectable in the image: you will need more data to construct a
"deeper" image.  For a given profile and dataset, the magnitude where
the completeness drops below a certain level (usually $90\%$) is known
as the completeness limit.
Another important parameter in measuring completeness is purity: the
fraction of true detections to all detections.  In effect, purity is a
measure of contamination by false detections: the higher the purity,
the lower the contamination.  Completeness and purity are
anti-correlated: if we allow a large number of false detections (that
we might be able to remove by other means), we can significantly
increase the completeness limit.
One traditional way to measure the completeness and purity of a given
sample is by embedding mock profiles in regions of the image with no
detection.  However, in such a study we must be very careful to choose
model profiles as similar to the target of interest as possible; a
rough sketch of this process with Gnuastro's programs is shown below.
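As a rough, hedged sketch of such a test with Gnuastro's own programs
(the mock catalog ‘mock.txt’, the image name and the zero point below
are all hypothetical; see *note MakeProfiles:: for the catalog format):
MakeProfiles' ‘--background’ option builds the mock profiles directly
over the input image, so detection can be re-run on the result:

     $ astmkprof mock.txt --background=image.fits --zeropoint=22.5 \
                 --output=injected.fits
     $ astnoisechisel injected.fits --output=check.fits

The completeness at a given magnitude is then the fraction of injected
profiles of that magnitude whose positions fall over a detection in
‘check.fits’.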
File: gnuastro.info, Node: Upper limit magnitude of each detection, Next: Magnitude limit of image, Prev: Completeness limit of each detection, Up: Quantifying measurement limits
7.4.3.5 Upper limit magnitude of each detection
...............................................
Due to the noisy nature of data, it is possible to get arbitrarily faint
magnitudes, especially when you use labels from another image (for
example see *note Working with catalogs estimating colors::). Given the
scatter caused by the dataset's noise, values fainter than a certain
level are meaningless: another similar depth observation will give a
radically different value.  In such cases, measurements like the image
magnitude limit are not useful because it is estimated for a certain
morphology and is given for the whole image (it is a crude
generalization; see *note Magnitude limit of image::).  You want a
quality measure that is specific to each object.
For example, assume that you have done your detection and
segmentation on one filter and now you do measurements over the same
labeled regions, but on other filters to measure colors (as we did in
the tutorial *note Segmentation and making a catalog::).  Some objects
are not going to have any significant signal in the other filters, but
you might measure a magnitude of 36 for one of them!  This is clearly
unreliable (no dataset in current astronomy can detect such a faint
signal).  In another image of the same depth, using the same filter,
you might measure a magnitude of 30 for it, and yet another might give
you 33.  Furthermore, the total sum of pixel values might actually be
negative in some images of the same depth (due to noise).  In these
cases, no magnitude can be defined and MakeCatalog will place a NaN
there (recall that a magnitude is a base-10 logarithm).
Using such unreliable measurements will directly affect our analysis,
so we must not use the raw measurements. When approaching the limits of
your detection method, it is therefore important to be able to identify
such cases. But how can we know how reliable a measurement of one
object on a given dataset is?
When we confront such unreasonably faint magnitudes, there is one
thing we can deduce: if something actually exists under our labeled
pixels (possibly buried deep under the noise), its inherent magnitude
is fainter than an _upper limit magnitude_.  To find this upper limit
magnitude, we place the object's footprint (segmentation map) over a
random part of the image where there are no detections, and measure the
sum of pixel values within the footprint.  Doing this a large number of
times will give us a distribution of measurements of the sum.  The
standard deviation ($\sigma_m$) of that distribution can be used to
quantify the upper limit magnitude for that particular object (given
its particular shape and area):
$$M_{up,n\sigma}=-2.5\times\log_{10}{(n\sigma_m)}+z \quad\quad [mag/target]$$
Traditionally, faint/small object photometry was done using fixed
circular apertures (for example, with a diameter of $N$ arc-seconds)
and there was not much processing involved (to make a deep stack).
Hence, the upper limit was synonymous with the surface brightness limit
(see *note Surface brightness limit of image::): one value for the
whole image.  The problem with this simplified approach is that the
number of pixels in the aperture directly affects the final
distribution and thus the magnitude.  Also, the image's correlated
noise might actually create certain patterns, so the shape of the
object can also affect the final result.  Fortunately, with the much
more advanced hardware and software of today, we can make customized
segmentation maps (footprints) for each object and have enough
computing power to actually place each footprint over many random
places.  As a result, the per-target upper limit magnitude and the
general surface brightness limit have diverged.
When any of the upper-limit-related columns are requested, MakeCatalog
will randomly place each target's footprint over the undetected parts
of the dataset as described above, and estimate the required
properties.  The procedure is fully configurable with the options in
*note Upper-limit settings::.  You can get the full list of upper-limit
related columns of MakeCatalog with this command (the extra ‘--’ before
‘--upperlimit’ is necessary(1)):
$ astmkcatalog --help | grep -- --upperlimit
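For example, below is a hedged sketch of requesting the upper limit
magnitude (measured over 1000 random positions) along with the regular
magnitude; the file name and zero point are hypothetical, and you can
confirm the exact spelling of the column options with the ‘grep’
command above:

     $ astmkcatalog seg.fits --ids --magnitude --upperlimit-mag \
                    --upnum=1000 --zeropoint=22.5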
---------- Footnotes ----------
(1) Without the extra ‘--’, grep will assume that ‘--upperlimit’ is
one of its own options, and will thus abort, complaining that it has no
option with this name.
File: gnuastro.info, Node: Magnitude limit of image, Next: Surface brightness limit of image, Prev: Upper limit magnitude of each detection, Up: Quantifying measurement limits
7.4.3.6 Magnitude limit of image
................................
Suppose we have taken two images of the same field of view with the same
CCD, once with a smaller telescope, and once with a larger one. Because
we used the same CCD, the noise will be very similar. However, the
larger telescope has gathered more light, therefore the same star or
galaxy will have a higher signal-to-noise ratio (S/N) in the image taken
with the larger one.  The same applies to a stacked image of the field
compared to a single-exposure image from the same telescope.
This concept is used by some researchers to define the "magnitude
limit" or "detection limit" at a certain S/N (sometimes 10, 5 or 3 for
example, also written as $10\sigma$, $5\sigma$ or $3\sigma$). To do
this, they measure the magnitude and signal-to-noise ratio of all the
objects within an image and measure the mean (or median) magnitude of
objects at the desired S/N. A fully working example of deriving the
magnitude limit is available in the tutorials section: *note Measuring
the dataset limits::.
However, this method should be used with extreme care! This is
because the shape of the object becomes important in this method: a
sharper object will have a higher _measured_ S/N compared to a more
diffuse object at the same original magnitude. Besides the inherent
shape/sharpness of the object, issues like the PSF also become important
in this method (because the finally observed shapes of objects are
important here): two surveys with the same surface brightness limit (see
*note Surface brightness limit of image::) will have different magnitude
limits if one is taken from space and the other from the ground.
File: gnuastro.info, Node: Surface brightness limit of image, Next: Upper limit surface brightness of image, Prev: Magnitude limit of image, Up: Quantifying measurement limits
7.4.3.7 Surface brightness limit of image
.........................................
As we make more observations of one region of the sky and add/combine
the observations into one dataset, both the signal and the noise
increase.  However, the signal increases much faster than the noise:
assuming you add $N$ datasets with equal exposure times, the signal
will increase as a multiple of $N$, while the noise increases as
$\sqrt{N}$.  Therefore the signal-to-noise ratio increases by a factor
of $\sqrt{N}$.  Visually, fainter (per pixel) parts of the
objects/signal in the image will become more visible/detectable.  The
noise level is known as the dataset's surface brightness limit.
You can think of the noise as muddy water that is completely covering
a flat ground(1). The signal (coming from astronomical objects in real
data) will be summits/hills that start from the flat sky level (under
the muddy water) and their summits can sometimes reach above the muddy
water. Let's assume that in your first observation the muddy water has
just been stirred and except a few small peaks, you cannot see anything
through the mud. As you wait and make more observations/exposures, the
mud settles down and the _depth_ of the transparent water increases. As
a result, more and more summits become visible and the lower parts of
the hills (parts with lower surface brightness) can be seen more
clearly. In this analogy(2), height (from the ground) is the _surface
brightness_ and the height of the muddy water at the moment you combine
your data, is your _surface brightness limit_ for that moment.
The outputs of NoiseChisel include the Sky standard deviation
($\sigma$) on every group of pixels (a tile) that were calculated from
the undetected pixels in each tile, see *note Tessellation:: and *note
NoiseChisel output::. Let's take $\sigma_m$ as the median $\sigma$ over
the successful tiles in the image (prior to interpolation or
smoothing). It is recorded in the ‘MEDSTD’ keyword of the ‘SKY_STD’
extension of NoiseChisel's output.
On different instruments, pixels cover different spatial angles over
the sky. For example, the width of each pixel on the ACS camera on the
Hubble Space Telescope (HST) is roughly 0.05 seconds of arc, while the
pixels of SDSS are each 0.396 seconds of arc (almost eight times
wider(3)). Nevertheless, irrespective of its sky coverage, a pixel is
our unit of data collection.
To start with, we define the low-level Surface brightness limit or
_depth_, in units of magnitude/pixel with the equation below (assuming
the image has zero point magnitude $z$ and we want the $n$th multiple of
$\sigma_m$).
$$SB_{n\sigma,\rm pixel}=-2.5\times\log_{10}{(n\sigma_m)}+z \quad\quad [mag/pixel]$$
As an example, the XDF survey covers part of the sky that the HST has
observed the most (for 85 orbits) and is consequently very small
($\sim4$ minutes of arc, squared).  On the other hand, the CANDELS
survey is one of the widest multi-color surveys done by the HST,
covering several fields (about 720 arcmin$^2$), but its deepest fields
have only 9 orbits of observation.  The $1\sigma$ depth of the XDF and
CANDELS-deep surveys in the near infrared WFC3/F160W filter are
respectively 34.40 and 32.45 magnitudes/pixel. In a single orbit image,
this same field has a $1\sigma$ depth of 31.32 magnitudes/pixel. Recall
that a larger magnitude corresponds to fainter objects, see *note
Brightness flux magnitude::.
The low-level magnitude/pixel measurement above is only useful when
all the datasets you want to use, or compare, have the same pixel size.
However, you will often find yourself using, or comparing, datasets from
various instruments with different pixel scales (projected pixel width,
in arc-seconds). If we know the pixel scale, we can obtain a more
easily comparable surface brightness limit in units of
magnitude/arcsec$^2$.  But another complication is that astronomical
objects are usually larger than 1 arcsec$^2$. As a result, it is common
to measure the surface brightness limit over a larger (but fixed,
depending on context) area.
Let's assume that every pixel is $p$ arcsec$^2$ and we want the
surface brightness limit for an object covering $A$ arcsec$^2$ (so
$A/p$ is the number of pixels that cover an area of $A$ arcsec$^2$).
On the other hand, noise is added in RMS(4), hence the noise level in
$A$ arcsec$^2$ is $n\sigma_m\sqrt{A/p}$.  But we want the result in
units of per arcsec$^2$, so we should divide this by $A$ arcsec$^2$:
$n\sigma_m\sqrt{A/p}/A=n\sigma_m\sqrt{A/(pA^2)}=n\sigma_m/\sqrt{pA}$.
Plugging this into the magnitude equation, we get the $n\sigma$ surface
brightness limit, over an area of $A$ arcsec$^2$, in units of
magnitudes/arcsec$^2$:
$$SB_{{n\sigma,\rm A arcsec}^2}=-2.5\times\log_{10}{\left({n\sigma_m\over \sqrt{pA}}\right)}+z \quad\quad [mag/arcsec^2]$$
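As a sketch of both equations above, assuming ‘nc.fits’ is
NoiseChisel's output and a hypothetical zero point of 22.5 (with $n=3$
and $A=100$ arcsec$^2$; ‘--pixelareaarcsec2’ was used earlier in this
chapter and ‘--keyvalue’ prints a keyword's value):

     $ z=22.5
     $ p=$(astfits nc.fits --pixelareaarcsec2)
     $ s=$(astfits nc.fits -hSKY_STD --keyvalue=MEDSTD --quiet)
     $ echo "3 $s $p 100 $z" \
           | awk '{print -2.5*log($1*$2)/log(10)+$5, \
                         -2.5*log($1*$2/sqrt($3*$4))/log(10)+$5}'

The first printed number is the per-pixel depth and the second is the
surface brightness limit over 100 arcsec$^2$.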
MakeCatalog will calculate the input dataset's $SB_{n\sigma,\rm
pixel}$ and $SB_{{n\sigma,\rm A arcsec}^2}$ and will write them as the
‘SBLMAGPIX’ and ‘SBLMAG’ keywords in the output catalog(s), see *note
MakeCatalog output::.  You can set your desired $n$-th multiple of
$\sigma$ and the $A$ arcsec$^2$ area using the following two options
respectively: ‘--sfmagnsigma’ and ‘--sfmagarea’ (see *note MakeCatalog
output::).  Just note that $SB_{{n\sigma,\rm A arcsec}^2}$ is only
calculated if the input has a World Coordinate System (WCS); without a
WCS, the pixel scale cannot be derived.
As you saw in its derivation, the calculation above extrapolates the
noise in one pixel over all the input's pixels! In other words, all
pixels are treated independently in the measurement of the standard
deviation. It therefore implicitly assumes that the noise is the same
in all of the pixels. But this only happens in individual exposures:
reduced data will have correlated noise because they are a stack of many
individual exposures that have been warped (thus mixing the pixel
values).  A more accurate measure, which will provide a realistic value
for every labeled region, is known as the _upper limit surface
brightness_, which is discussed in the next section (*note Upper limit
surface brightness of image::).
---------- Footnotes ----------
(1) The ground is the sky value in this analogy, see *note Sky
value::. Note that this analogy only holds for a flat sky value across
the surface of the image or ground.
(2) Note that this muddy water analogy is not perfect, because while
the water-level remains the same all over a peak, in data analysis, the
Poisson noise increases with the level of data.
(3) Ground-based instruments like the SDSS suffer from strong
smoothing due to the atmosphere. Therefore, increasing the pixel
resolution (or decreasing the width of a pixel) will not increase the
received information.
(4) If you add three datasets with noise $\sigma_1$, $\sigma_2$ and
$\sigma_3$, the resulting noise level is
$\sigma_t=\sqrt{\sigma_1^2+\sigma_2^2+\sigma_3^2}$, so when
$\sigma_1=\sigma_2=\sigma_3\equiv\sigma$, then
$\sigma_t=\sigma\sqrt{3}$. In this case, the area $A$ is covered by
$A/p$ pixels, so the noise level is $\sigma_t=\sigma\sqrt{A/p}$.
File: gnuastro.info, Node: Upper limit surface brightness of image, Prev: Surface brightness limit of image, Up: Quantifying measurement limits
7.4.3.8 Upper limit surface brightness of image
...............................................
As mentioned in *note Surface brightness limit of image::, the surface
brightness limit assumes independent pixels when deriving the standard
deviation (the main input in the equation).  It just extrapolates the
standard deviation derived from one pixel to the requested area.  But
as mentioned at the end of that section, we have correlated noise in
our science-ready (deep) images, so the noise of the pixels is not
independent.
Because of this, the surface brightness limit will always
under-estimate the surface brightness (give fainter values than what is
statistically possible in the data for the requested area).  To account
for the correlated noise in the images, we need to derive the standard
deviation over a group of pixels that fall within a certain
footprint/shape: for example, over a circular aperture of radius 5.6419
arcsec, or a square with a side length of $10$ arcsec.  Depending on
the correlated noise systematics, the limit can be (very) different for
different shapes, even if they have the same area (as with the circle
and square just mentioned: both have an area of 100 arcsec$^2$).
Therefore we need to derive the standard deviation that goes into the
surface brightness limit equation over a certain footprint/shape. To do
that, we should:
1. Place the desired footprint many times randomly over all the
undetected pixels in an image. In MakeCatalog, the number of these
random positions can be configured with ‘--upnum’ and you can check
their positions with ‘--checkuplim’.
2. Calculate the sum of pixel values in each randomly placed
footprint.
3. Calculate the sigma-clipped standard deviation of the resulting
distribution (of the sums of pixel values in the randomly placed
apertures).  Each footprint's measurement is therefore
independent of the others.
4. Calculate the surface brightness of that standard deviation (after
multiplying it by your desired multiple of sigma).  For the
definition of surface brightness, see *note Brightness flux
magnitude::.
If you have reviewed the previous sections, the measurements over
randomly placed apertures should remind you of *note Upper limit
magnitude of each detection::.  Generally, the "upper limit" prefix is
given to all measurements that are made in this way.  Therefore this
limit is called the "upper limit surface brightness" of an image (for a
multiple of sigma, over a certain shape).
Traditionally a circular aperture of a fixed radius (in arcseconds)
has been used. In Gnuastro, a labeled image containing the desired
shape/aperture can be generated with MakeProfiles. The position of the
label is irrelevant because the upper limit measurements are done on the
many randomly placed footprints in undetected regions (independent of
where the label is positioned).  That labeled image should then be
given to MakeCatalog, while requesting ‘--upperlimit-sb’.  Of course,
all detected signal in the image needs to be masked (set to blank/NaN)
so that the randomly placed apertures do not overlap with any
detections.
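Below is a hedged sketch of this workflow (the file names, aperture
radius of 10 pixels and zero point are hypothetical, and the catalog
columns follow MakeProfiles' standard format with profile code 5 for a
flat profile; ‘masked.fits’ is assumed to be your image with all
detections blanked):

     $ echo "1 100 100 5 10 0 0 1 1 1" \
           | astmkprof --background=image.fits --clearcanvas \
                       --mforflatpix --type=uint8 --mode=img \
                       --output=aperture.fits
     $ astmkcatalog aperture.fits -h1 --valuesfile=masked.fits \
                    --zeropoint=22.5 --upperlimit-sb

This is only a sketch; the tutorial mentioned below walks through a
complete, tested procedure.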
Going into the implementation details can get hard to follow in plain
English, so a full hands-on tutorial is available in the second half of
*note Image surface brightness limit::.  Work through that tutorial
with the same input images, run its commands and inspect each output
image to get a good understanding of how to properly measure the upper
limit surface brightness of your images.
File: gnuastro.info, Node: Measuring elliptical parameters, Next: Adding new columns to MakeCatalog, Prev: Quantifying measurement limits, Up: MakeCatalog
7.4.4 Measuring elliptical parameters
-------------------------------------
The shape or morphology of a target is one of its most commonly desired
parameters.  Here, we will review the derivation of the most
basic/simple morphological parameters: the elliptical parameters for a
set of labeled pixels.  The elliptical parameters are the (semi-)major
axis, the (semi-)minor axis and the position angle, along with the
central position of the profile.  The derivations below follow the
SExtractor manual's derivations, with some added explanations for
easier reading.
Let's begin with one dimension for simplicity: Assume we have a set
of $N$ values $B_i$ (for example, showing the spatial distribution of a
target's brightness), each at position $x_i$. The simplest parameter we
can define is the geometric center of the object ($x_g$) (ignoring the
brightness values): $x_g=(\sum_ix_i)/N$. _Moments_ are defined to
incorporate both the value (brightness) and position of the data. The
first moment can be written as:
$$\overline{x}={\sum_iB_ix_i \over \sum_iB_i}$$
This is essentially the weighted (by $B_i$) mean position. The
geometric center ($x_g$, defined above) is a special case of this with
all $B_i=1$. The second moment is essentially the variance of the
distribution:
$$\overline{x^2}\equiv{\sum_iB_i(x_i-\overline{x})^2 \over
\sum_iB_i} = {\sum_iB_ix_i^2 \over \sum_iB_i} -
2\overline{x}{\sum_iB_ix_i\over\sum_iB_i} + \overline{x}^2
={\sum_iB_ix_i^2 \over \sum_iB_i} - \overline{x}^2$$
The last step was done using the definition of $\overline{x}$.  Hence,
the square root of $\overline{x^2}$ is the spatial standard deviation
(along that one dimension) of this particular brightness distribution
($B_i$).  Crudely (or qualitatively), you can think of this standard
deviation as the distance (from $\overline{x}$) which contains a
specific amount of the flux (depending on the $B_i$ distribution).
Similar to the first moment, the geometric second moment can be found
by setting all $B_i=1$.
So while the first moment quantified the position of the brightness
distribution, the second moment quantifies how that brightness is
dispersed about the first moment. In other words, it quantifies how
"sharp" the object's image is.
Before continuing to two dimensions and the derivation of the
elliptical parameters, let's pause for an important implementation
technicality. You can ignore this paragraph and the next two if you do
not want to implement these concepts.  The basic definition (the first
definition of $\overline{x^2}$ above) can be used without any major
problem.  However, that fraction requires two runs over the data: one
run to find $\overline{x}$ and another to find $\overline{x^2}$ from
$\overline{x}$, which can be slow.  The advantage of the last fraction
above is that we can estimate both the first and second moments in one
run (since the $-\overline{x}^2$ term can easily be added later).
The logarithmic nature of floating point number digitization creates
a complication, however: suppose the object is located between pixels
10000 and 10020.  Hence the target's pixels are only distributed over
20 pixels (with a standard deviation $<20$), while the mean has a value
of $\sim10000$.  The $\sum_iB_ix_i^2$ term will go to very large
values, while the individual pixel differences will be orders of
magnitude smaller.  This will lower the accuracy of our calculation due
to the limited accuracy of floating point operations.  The variance
only depends on the distance of each point from the mean, so we can
shift all positions by a constant/arbitrary $K$ which is much closer to
the mean: $\overline{x-K}=\overline{x}-K$.  Hence we can calculate the
second order moment using:
$$\overline{x^2}={\sum_iB_i(x_i-K)^2 \over \sum_iB_i} -
(\overline{x}-K)^2 $$
The closer $K$ is to $\overline{x}$, the better (the sums of squares
will involve smaller numbers): as long as $K$ is within the object's
limits (in the example above: $10000\leq{K}\leq10020$), the floating
point error induced in our calculation will be negligible.  For the
simplest implementation, MakeCatalog takes $K$ to be the smallest
position of the object in each dimension.  Since $K$ is arbitrary and
an implementation/technical detail, we will ignore it for the remainder
of this discussion.
In two dimensions, the mean and variances can be written as:
$$\overline{x}={\sum_iB_ix_i\over \sum_iB_i}, \quad
\overline{x^2}={\sum_iB_ix_i^2 \over \sum_iB_i} -
\overline{x}^2$$
$$\overline{y}={\sum_iB_iy_i\over \sum_iB_i}, \quad
\overline{y^2}={\sum_iB_iy_i^2 \over \sum_iB_i} -
\overline{y}^2$$
$$\quad\quad\quad\quad\quad\quad\quad\quad\quad
\overline{xy}={\sum_iB_ix_iy_i \over \sum_iB_i} -
\overline{x}\times\overline{y}$$
If an elliptical profile's major axis exactly lies along the $x$
axis, then $\overline{x^2}$ will be directly proportional with the
profile's major axis, $\overline{y^2}$ with its minor axis and
$\overline{xy}=0$.  However, in reality we are not that lucky and
(assuming galaxies can be parameterized as ellipses) the major axes of
galaxies can lie in any direction on the image (in fact, this is one of
the core principles behind weak-lensing shear estimation).  So the
purpose of the remainder of this section is to define a strategy for
measuring the position angle and axis ratio of randomly oriented
ellipses in an image, using the raw second moments calculated above in
our image coordinates.
Let's assume we have rotated the galaxy by $\theta$, the new second
order moments are:
$$\overline{x_\theta^2} = \overline{x^2}\cos^2\theta +
\overline{y^2}\sin^2\theta -
2\overline{xy}\cos\theta\sin\theta $$
$$\overline{y_\theta^2} = \overline{x^2}\sin^2\theta +
\overline{y^2}\cos^2\theta +
2\overline{xy}\cos\theta\sin\theta$$
$$\overline{xy_\theta} = \overline{x^2}\cos\theta\sin\theta -
\overline{y^2}\cos\theta\sin\theta +
\overline{xy}(\cos^2\theta-\sin^2\theta)$$
The best $\theta$ ($\theta_0$, where major axis lies along the
$x_\theta$ axis) can be found by:
$$\left.{\partial \overline{x_\theta^2} \over \partial \theta}\right|_{\theta_0}=0$$
Taking the derivative, we get:
$$2\cos\theta_0\sin\theta_0(\overline{y^2}-\overline{x^2}) +
2(\cos^2\theta_0-\sin^2\theta_0)\overline{xy}=0$$ When
$\overline{x^2}\neq\overline{y^2}$, we can write:
$$\tan2\theta_0 =
2{\overline{xy} \over \overline{x^2}-\overline{y^2}}.$$
MakeCatalog uses the standard C math library's ‘atan2’ function to
estimate $\theta_0$, which we define as the position angle of the
ellipse. To recall, this is the angle of the major axis of the ellipse
with the $x$ axis. By definition, when the elliptical profile is
rotated by $\theta_0$, then $\overline{xy_{\theta_0}}=0$,
$\overline{x_{\theta_0}^2}$ will be the extent of the maximum variance
and $\overline{y_{\theta_0}^2}$ the extent of the minimum variance
(which are perpendicular for an ellipse).  Replacing $\theta_0$ in the
equations above for $\overline{x_\theta^2}$ and
$\overline{y_\theta^2}$, we can get the semi-major ($A$) and semi-minor
($B$) lengths:
$$A^2\equiv\overline{x_{\theta_0}^2}= {\overline{x^2} +
\overline{y^2} \over 2} + \sqrt{\left({\overline{x^2}-\overline{y^2} \over 2}\right)^2 + \overline{xy}^2}$$
$$B^2\equiv\overline{y_{\theta_0}^2}= {\overline{x^2} +
\overline{y^2} \over 2} - \sqrt{\left({\overline{x^2}-\overline{y^2} \over 2}\right)^2 + \overline{xy}^2}$$
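As a rough numerical sketch of these relations (with hypothetical
second moments $\overline{x^2}=4$, $\overline{y^2}=1$ and
$\overline{xy}=1$), the position angle (in degrees, using
$\pi=4\times{\rm atan2}(1,1)$) and the semi-major and semi-minor
lengths can be computed like below; it prints a position angle of
$\approx16.8$ degrees, with $A\approx2.07$ and $B\approx0.84$ pixels:

     $ echo "4 1 1" \
           | awk '{t=0.5*atan2(2*$3, $1-$2); \
                   m=($1+$2)/2; d=sqrt((($1-$2)/2)^2 + $3^2); \
                   print t*45/atan2(1,1), sqrt(m+d), sqrt(m-d)}'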
As a summary, it is important to remember that the units of $A$ and
$B$ are in pixels (the standard deviation of a positional distribution)
and that they represent the spatial light distribution of the object in
both image dimensions (rotated by $\theta_0$). When the object cannot
be represented as an ellipse, this interpretation breaks down:
$\overline{xy_{\theta_0}}\neq0$ and $\overline{y_{\theta_0}^2}$ will not
be the direction of minimum variance.
File: gnuastro.info, Node: Adding new columns to MakeCatalog, Next: MakeCatalog measurements, Prev: Measuring elliptical parameters, Up: MakeCatalog
7.4.5 Adding new columns to MakeCatalog
---------------------------------------
MakeCatalog is designed to allow easy addition of different measurements
over a labeled image; see Akhlaghi 2016
(https://arxiv.org/abs/1611.06387v1). A check-list style description of
necessary steps to do that is described in this section.  The common
development characteristics of MakeCatalog and other Gnuastro programs
are explained in *note Developing::.  We strongly encourage you to have
a look at that chapter to greatly simplify your navigation in the code.
After adding and testing your column, you are most welcome (and
encouraged) to share it with us so we can add it to the next release of
Gnuastro for everyone else to also benefit from your efforts.
MakeCatalog will first pass over each label's pixels two times to do
the necessary raw/internal calculations.  Once the passes are done, it
will use the raw information to fill the final catalog's columns.  In
the first pass it will gather mainly object information and in the
second pass, it will mainly focus on the clumps, or any other
measurement that needs an output from the first pass.  These two passes
are designed to be raw summations: no extra processing.  This allows
parallel processing and simplicity/clarity.  So if your new calculation
needs new raw information from the pixels, you will also need to modify
the respective ‘mkcatalog_first_pass’ and ‘mkcatalog_second_pass’
functions (both in ‘bin/mkcatalog/mkcatalog.c’) and define new raw
table columns in ‘main.h’ (hopefully the comments in the code are clear
enough).
In all these different places, the final columns are sorted in the
same order (same order as *note Invoking astmkcatalog::). This allows a
particular column/option to be easily found in all steps. Therefore in
adding your new option, be sure to keep it in the same relative place
in the list in all the separate places (it does not necessarily have to
be at the end), and near conceptually similar options.
‘main.h’
The ‘objectcols’ and ‘clumpcols’ enumerated variables (‘enum’)
define the raw/internal calculation columns. If your new column
requires new raw calculations, add a row to the respective list.
If your calculation requires any other settings parameters, you
should add a variable to the ‘mkcatalogparams’ structure.
‘ui.c’
If the new column needs raw calculations (an entry was added in
‘objectcols’ and ‘clumpcols’), specify which inputs it needs in
‘ui_necessary_inputs’, similar to the other options. Afterwards,
if your column includes any particular settings (you needed to add
a variable to the ‘mkcatalogparams’ structure in ‘main.h’), you
should do the sanity checks and preparations for it here.
‘ui.h’
The ‘option_keys_enum’ associates a unique value with each option
of MakeCatalog.  For options that have a short version, the
single-character short option is used as the value.  Those that do
not have a short option version get a large integer automatically.
You should add a variable here to identify your desired column.
‘args.h’
This file specifies all the parameters for the GNU C library's Argp
structure, which is in charge of reading the user's options.  To
define your new column, just copy an existing set of parameters and
change the first, second and fifth values (the only ones that
differ between the columns); you should use the macro you defined
in ‘ui.h’ here.
‘columns.c’
This file contains the main definition and high-level calculation
of your new column through the ‘columns_define_alloc’ and
‘columns_fill’ functions. In the first, you specify the basic
information about the column: its name, units, comments, type (see
*note Numeric data types::) and how it should be printed if the
output is a text file. You should also specify the raw/internal
columns that are necessary for this column here, as the many
existing examples show.  Through the types for objects and clumps,
you can specify if this column is only for clumps, objects, or both.
The second main function (‘columns_fill’) writes the final value
into the appropriate column for each object and clump. As you can
see in the many existing examples, you can define your processing
on the raw/internal calculations here and save them in the output.
‘mkcatalog.c’
This file contains the low-level parsing functions. To be
optimized, the parsing is done in parallel through the
‘mkcatalog_single_object’ function. This function initializes the
necessary arrays and calls the lower-level ‘parse_objects’ and
‘parse_clumps’ for actually going over the pixels. They are all
heavily commented, so you should be able to follow where to add
your necessary low-level calculations.
‘doc/gnuastro.texi’
Update this manual and add a description for the new column.
File: gnuastro.info, Node: MakeCatalog measurements, Next: Invoking astmkcatalog, Prev: Adding new columns to MakeCatalog, Up: MakeCatalog
7.4.6 MakeCatalog measurements
------------------------------
MakeCatalog's output measurements/columns can be specified using
command-line options (*note Options::). The current measurements in
MakeCatalog are those which only produce one final value for each label
(for example, its magnitude: a single number).  Each label's
measurement can thus be written as one column in a final table/catalog
that contains other columns for other similar single-valued
measurements.
The majority of this section is devoted to MakeCatalog's
single-valued measurements.  However, MakeCatalog can also do
measurements that produce more than one value for each label.
Currently the only such measurement is the generation of spectra from
3D cubes with the ‘--spectrum’ option, which is discussed at the end of
this section.
Command-line options are used to identify which measurements you want
in the final catalog(s) and in what order. If any of the options below
is called on the command-line or in any of the configuration files, it
will be included as a column in the output catalog. The order of the
columns is in the same order as the options were seen by MakeCatalog
(see *note Configuration file precedence::). Some of the columns apply
to both "objects" and "clumps" and some are particular to only one of
them (for the definition of "objects" and "clumps", see *note
Segment::). Columns/options that are unique to one catalog (only
objects, or only clumps), are explicitly marked with [Objects] or
[Clumps] to specify the catalog they will be placed in.
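For example, with the command below the output objects catalog will
have four columns in this exact order: the label identifiers, the
flux-weighted centers (in X and Y) and the magnitudes (the file name
and zero point are hypothetical):

     $ astmkcatalog seg.fits --ids --x --y --magnitude --zeropoint=22.5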
* Menu:
* Identifier columns:: Identifying labels of each row (object/clumps).
* Position measurements in pixels:: Containing image/pixel (X/Y) measurements.
* Position measurements in WCS:: Containing WCS (for example RA/Dec) measurements.
* Brightness measurements:: Using pixel values of each label.
* Surface brightness measurements:: Various ways to measure surface brightness.
* Morphology measurements nonparametric:: Non-parametric morphology.
* Morphology measurements elliptical:: Elliptical morphology measurements.
* Measurements per slice spectra:: Measurements on each slice (like spectra).
File: gnuastro.info, Node: Identifier columns, Next: Position measurements in pixels, Prev: MakeCatalog measurements, Up: MakeCatalog measurements
7.4.6.1 Identifier columns
..........................
The identifier of each row (group of measurements) is usually the first
thing you will request from MakeCatalog.  Without the identifier, it is
not clear which measurement corresponds to which label in the input.
Since MakeCatalog can also optionally take sub-structure labels
(clumps; see *note Segment::), there are various identifiers in
general, listed below.  The most generic (and shortest and easiest to
type!) is the ‘--ids’ option, which can be used in object-only or
object-clump catalogs.
‘-i’
‘--ids’
This is a unique option which can add multiple columns to the final
catalog(s). Calling this option will put the object IDs
(‘--obj-id’) in the objects catalog and host-object-ID
(‘--host-obj-id’) and ID-in-host-object (‘--id-in-host-obj’) into
the clumps catalog. Hence if only object catalogs are required, it
has the same effect as ‘--obj-id’.
‘--obj-id’
[Objects] ID of this object.
‘-j’
‘--host-obj-id’
[Clumps] The ID of the object which hosts this clump.
‘--id-in-host-obj’
[Clumps] The ID of this clump in its host object.
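For example, the command below is a minimal sketch of requesting
identifiers (the input name ‘seg.fits’ is hypothetical, assumed to be
Segment's output). Because of ‘--clumpscat’, two catalogs will be made
and each will contain its respective identifier columns described above:
$ astmkcatalog --ids --clumpscat seg.fits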
File: gnuastro.info, Node: Position measurements in pixels, Next: Position measurements in WCS, Prev: Identifier columns, Up: MakeCatalog measurements
7.4.6.2 Position measurements in pixels
.......................................
The position of a labeled region within your input dataset (in its own
units) can be measured with the options in this section. By "in its own
units" we mean pixels in a 2D image or voxels in a 3D cube. For example
if the flux-weighted center of a label lies 123 pixels on the horizontal
and 456 pixels on the vertical, the ‘--x’ and ‘--y’ options will put a
value of 123 and 456 in their respective columns. As you see below,
there are various ways to define the "position" of an object, so read
the differences carefully to choose the one that corresponds best to
your usage.
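For example, the command below is a sketch (with a hypothetical
‘seg.fits’ input) that requests both the flux weighted and geometric
centers, along with ‘--weight-area’ to check which one was actually
used for each label:
$ astmkcatalog --x --y --geo-x --geo-y --weight-area seg.fits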
‘-x’
‘--x’
The flux weighted center of all objects and clumps along the first
FITS axis (horizontal when viewed in SAO DS9), see $\overline{x}$
in *note Measuring elliptical parameters::. The weight has to have
a positive value (pixel value larger than the Sky value) to be
meaningful! Especially when doing matched photometry, this might
not happen: no pixel value might be above the Sky value. For such
detections, the geometric center will be reported in this column
(see ‘--geo-x’). You can use ‘--weight-area’ to see which was
used.
‘-y’
‘--y’
The flux weighted center of all objects and clumps along the second
FITS axis (vertical when viewed in SAO DS9). See ‘--x’.
‘-z’
‘--z’
The flux weighted center of all objects and clumps along the third
FITS axis. See ‘--x’.
‘--geo-x’
The geometric center of all objects and clumps along the first FITS
axis. The geometric center is the average of the pixel positions,
irrespective of their pixel values.
‘--geo-y’
The geometric center of all objects and clumps along the second
FITS axis, see ‘--geo-x’.
‘--geo-z’
The geometric center of all objects and clumps along the third FITS
axis, see ‘--geo-x’.
‘--min-val-x’
Position of pixel with minimum value in objects and clumps, along
the first FITS axis.
‘--max-val-x’
Position of pixel with maximum value in objects and clumps, along
the first FITS axis.
‘--min-val-y’
Position of pixel with minimum value in objects and clumps, along
the second FITS axis.
‘--max-val-y’
Position of pixel with maximum value in objects and clumps, along
the second FITS axis.
‘--min-val-z’
Position of pixel with minimum value in objects and clumps, along
the third FITS axis.
‘--max-val-z’
Position of pixel with maximum value in objects and clumps, along
the third FITS axis.
‘--min-x’
The minimum position of all objects and clumps along the first FITS
axis.
‘--max-x’
The maximum position of all objects and clumps along the first FITS
axis.
‘--min-y’
The minimum position of all objects and clumps along the second
FITS axis.
‘--max-y’
The maximum position of all objects and clumps along the second
FITS axis.
‘--min-z’
The minimum position of all objects and clumps along the third FITS
axis.
‘--max-z’
The maximum position of all objects and clumps along the third FITS
axis.
‘--clumps-x’
[Objects] The flux weighted center of all the clumps in this object
along the first FITS axis. See ‘--x’.
‘--clumps-y’
[Objects] The flux weighted center of all the clumps in this object
along the second FITS axis. See ‘--x’.
‘--clumps-z’
[Objects] The flux weighted center of all the clumps in this object
along the third FITS axis. See ‘--x’.
‘--clumps-geo-x’
[Objects] The geometric center of all the clumps in this object
along the first FITS axis. See ‘--geo-x’.
‘--clumps-geo-y’
[Objects] The geometric center of all the clumps in this object
along the second FITS axis. See ‘--geo-x’.
‘--clumps-geo-z’
[Objects] The geometric center of all the clumps in this object
along the third FITS axis. See ‘--geo-z’.
File: gnuastro.info, Node: Position measurements in WCS, Next: Brightness measurements, Prev: Position measurements in pixels, Up: MakeCatalog measurements
7.4.6.3 Position measurements in WCS
....................................
The position of a labeled region within your input dataset (in the World
Coordinate System, or WCS) can be measured with the options in this
section. As you see below, there are various ways to define the
"position" of an object, so read the differences carefully to choose the
one that corresponds best to your usage.
The most common WCS coordinates are Right Ascension (RA) and
Declination in an equatorial system. Therefore, to simplify their
usage, we have the special ‘--ra’ and ‘--dec’ options. However, a
dataset's WCS is not necessarily equatorial (it may be in Galactic
coordinates for example), so to be generic, you can use the ‘--w1’,
‘--w2’ or ‘--w3’ (if you have a 3D cube) options. In case your
dataset's WCS is not in your desired system (for example it is Galactic,
but you want equatorial 2000), you can use the ‘--wcscoordsys’ option of
Gnuastro's Fits program on the labeled image before running MakeCatalog
(see *note Keyword inspection and manipulation::).
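For example, the commands below are a sketch of that scenario (the
file name is hypothetical): the first converts the labeled image's WCS
to equatorial (J2000) and the second then measures the flux weighted
RA and Dec:
$ astfits seg.fits --wcscoordsys=eq-j2000
$ astmkcatalog --ids --ra --dec seg.fits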
‘-r’
‘--ra’
Flux weighted right ascension of all objects or clumps, see ‘--x’.
This is just an alias for one of the lower-level ‘--w1’ or ‘--w2’
options. Using the FITS WCS keywords (‘CTYPE’), MakeCatalog will
determine which axis corresponds to the right ascension. If no
‘CTYPE’ keywords start with ‘RA’, an error will be printed when
requesting this column and MakeCatalog will abort.
‘-d’
‘--dec’
Flux weighted declination of all objects or clumps, see ‘--x’.
This is just an alias for one of the lower-level ‘--w1’ or ‘--w2’
options. Using the FITS WCS keywords (‘CTYPE’), MakeCatalog will
determine which axis corresponds to the declination. If no ‘CTYPE’
keywords start with ‘DEC’, an error will be printed when requesting
this column and MakeCatalog will abort.
‘--w1’
Flux weighted first WCS axis of all objects or clumps, see ‘--x’.
The first WCS axis is commonly used as right ascension in images.
‘--w2’
Flux weighted second WCS axis of all objects or clumps, see ‘--x’.
The second WCS axis is commonly used as declination in images.
‘--w3’
Flux weighted third WCS axis of all objects or clumps, see ‘--x’.
The third WCS axis is commonly used as wavelength in integral field
unit data cubes.
‘--geo-w1’
Geometric center in first WCS axis of all objects or clumps, see
‘--geo-x’. The first WCS axis is commonly used as right ascension
in images.
‘--geo-w2’
Geometric center in second WCS axis of all objects or clumps, see
‘--geo-x’. The second WCS axis is commonly used as declination in
images.
‘--geo-w3’
Geometric center in third WCS axis of all objects or clumps, see
‘--geo-x’. The third WCS axis is commonly used as wavelength in
integral field unit data cubes.
‘--clumps-w1’
[Objects] Flux weighted center in first WCS axis of all clumps in
this object, see ‘--x’. The first WCS axis is commonly used as
right ascension in images.
‘--clumps-w2’
[Objects] Flux weighted center in second WCS axis of all clumps in
this object, see ‘--x’. The second WCS axis is commonly used as
declination in images.
‘--clumps-w3’
[Objects] Flux weighted center in third WCS axis of all clumps in
this object, see ‘--x’. The third WCS axis is commonly used as
wavelength in integral field unit data cubes.
‘--clumps-geo-w1’
[Objects] Geometric center in first WCS axis of all clumps in this
object, see ‘--geo-x’. The first WCS axis is commonly used as
right ascension in images.
‘--clumps-geo-w2’
[Objects] Geometric center in second WCS axis of all clumps in
this object, see ‘--geo-x’. The second WCS axis is commonly used
as declination in images.
‘--clumps-geo-w3’
[Objects] Geometric center in third WCS axis of all clumps in this
object, see ‘--geo-x’. The third WCS axis is commonly used as
wavelength in integral field unit data cubes.
File: gnuastro.info, Node: Brightness measurements, Next: Surface brightness measurements, Prev: Position measurements in WCS, Up: MakeCatalog measurements
7.4.6.4 Brightness measurements
...............................
Within an image, pixels have both a position and a value. In the
sections above all the measurements involved position (see *note
Position measurements in pixels:: or *note Position measurements in
WCS::). The measurements in this section only deal with pixel values
and ignore the pixel positions completely. In other words, for the
options of this section each labeled region within the input is just a
group of values (and their associated error values if necessary), and
they let you do various types of measurements on the resulting
distribution of values.
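For example, the command below is a sketch of requesting some of these
columns, assuming ‘seg.fits’ is Segment's default output (so the
values and Sky standard deviation datasets are also inside it, see
*note Invoking astmkcatalog::) and the zero point (needed for the
magnitude) is 25:
$ astmkcatalog seg.fits --ids --sum --magnitude --sn --zeropoint=25.0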
‘--sum’
The sum of all pixel values associated to this label (object or
clump). Note that if a sky value or image has been given, it will
be subtracted before any column measurement. For clumps, the
ambient value (the average of the river pixels around the clump,
multiplied by the area of the clump) is subtracted, see
‘--river-mean’. So the sum of all the clump-sums in the clump
catalog of one object will be smaller than the ‘--clumps-sum’
column of the objects catalog.
If no usable pixels are present over the clump or object (for
example, they are all blank), the returned value will be NaN (note
that zero is meaningful).
‘--sum-error’
The ($1\sigma$) error in measuring the sum of values of a label
(objects or clumps).
The returned value will be NaN when the label covers only NaN
pixels in the values image, or a pixel is NaN in the ‘--instd’
image, but non-NaN in the values image. The latter situation
usually happens when there is a bug in the previous steps of your
analysis, and is important because those pixels with a NaN in the
‘--instd’ image may contribute significantly to the final error.
If you want to ignore those pixels in the error measurement, set
them to zero (which is a meaningful number in such scenarios).
‘--clumps-sum’
[Objects] The total sum of the pixels covered by clumps (before
subtracting the river) within each object. This is simply the sum
of ‘--sum-no-river’ in the clumps catalog (see below). If no
usable pixels are present over the clump or object (for example,
they are all blank), the stored value will be NaN (note that zero
is meaningful).
‘--sum-no-river’
[Clumps] The sum of Sky (not river) subtracted clump pixel values.
By definition, for the clumps, the average value of the rivers
surrounding it are subtracted from it for a first order accounting
for contamination by neighbors.
If no usable pixels are present over the clump or object (for
example, they are all blank), the stored value will be NaN (note
that zero is meaningful).
‘--mean’
The mean sky subtracted value of pixels within the object or clump.
For clumps, the average river flux is subtracted from the sky
subtracted mean.
‘--std’
The standard deviation of the pixels within the object or clump.
For clumps, the river pixels are not subtracted because that is a
constant (per pixel) value and should not affect the standard
deviation.
‘--median’
The median sky subtracted value of pixels within the object or
clump. For clumps, the average river flux is subtracted from the
sky subtracted median.
‘--maximum’
The maximum value of pixels within the object or clump. When the
label (object or clump) is larger than three pixels, the maximum is
actually derived by the mean of the brightest three pixels, not the
largest pixel value of the same label. This is because noise
fluctuations can be very strong in the extreme values of the
objects/clumps due to Poisson noise (which gets stronger as the
mean gets higher). Simply using the maximum pixel value will
create a strong scatter in results that depend on the maximum (for
example, the ‘--fwhm’ option also uses this value internally).
‘--sigclip-number’
The number of elements/pixels in the dataset after sigma-clipping
the object or clump. The sigma-clipping parameters can be set with
the ‘--sigmaclip’ option described in *note MakeCatalog inputs and
basic settings::. For more on Sigma-clipping, see *note Sigma
clipping::.
‘--sigclip-median’
The sigma-clipped median value of the object or clump's pixel
distribution. For more on sigma-clipping and how to define it, see
‘--sigclip-number’.
‘--sigclip-mean’
The sigma-clipped mean value of the object or clump's pixel
distribution. For more on sigma-clipping and how to define it, see
‘--sigclip-number’.
‘--sigclip-std’
The sigma-clipped standard deviation of the object or clump's pixel
distribution. For more on sigma-clipping and how to define it, see
‘--sigclip-number’.
‘-m’
‘--magnitude’
The magnitude of clumps or objects, see ‘--sum’.
‘--magnitude-error’
The magnitude error of clumps or objects. The magnitude error is
calculated from the signal-to-noise ratio (see ‘--sn’ and *note
Quantifying measurement limits::). Note that until now this error
assumes uncorrelated pixel values and also does not include the
error in estimating the aperture (or error in generating the
labeled image).
For now these factors have to be found by other means. Task 14124
(https://savannah.gnu.org/task/index.php?14124) has been defined
for work on adding these sources of error too.
The returned value will be NaN when the label covers only NaN
pixels in the values image, or a pixel is NaN in the ‘--instd’
image, but non-NaN in the values image. The latter situation
usually happens when there is a bug in the previous steps of your
analysis, and is important because those pixels with a NaN in the
‘--instd’ image may contribute significantly to the final error.
If you want to ignore those pixels in the error measurement, set
them to zero (which is a meaningful number in such scenarios).
‘--clumps-magnitude’
[Objects] The magnitude of all clumps in this object, see
‘--clumps-sum’.
‘--upperlimit’
The upper limit value (in units of the input image) for this object
or clump. This is the sigma-clipped standard deviation of the
random distribution, multiplied by the value of ‘--upnsigma’. See
*note Quantifying measurement limits:: and *note Upper-limit
settings:: for a complete explanation. This is very important for
the fainter and smaller objects in the image where the measured
magnitudes are not reliable.
‘--upperlimit-mag’
The upper limit magnitude for this object or clump. See *note
Quantifying measurement limits:: and *note Upper-limit settings::
for a complete explanation. This is very important for the fainter
and smaller objects in the image where the measured magnitudes are
not reliable.
‘--upperlimit-onesigma’
The $1\sigma$ upper limit value (in units of the input image) for
this object or clump. See *note Quantifying measurement limits::
and *note Upper-limit settings:: for a complete explanation. When
‘--upnsigma=1’, this column's values will be the same as
‘--upperlimit’.
‘--upperlimit-sigma’
The position of the label's sum measured within the distribution of
randomly placed upperlimit measurements in units of the
distribution's $\sigma$ or standard deviation. See *note
Quantifying measurement limits:: and *note Upper-limit settings::
for a complete explanation.
‘--upperlimit-quantile’
The position of the label's sum within the distribution of randomly
placed upperlimit measurements as a quantile (value between 0 and
1). See *note Quantifying measurement limits:: and *note
Upper-limit settings:: for a complete explanation. If the object
is brighter than the brightest randomly placed profile, a value of
‘inf’ is returned. If it is less than the minimum, a value of
‘-inf’ is reported.
‘--upperlimit-skew’
This column contains the non-parametric skew of the
$\sigma$-clipped random distribution that was used to estimate the
upper-limit magnitude. Taking $\mu$ as the mean, $\nu$ as the
median and $\sigma$ as the standard deviation, the traditional
definition of skewness is defined as: $(\mu-\nu)/\sigma$.
This can be a good measure to see how much you can trust the random
measurements, or in other words, how accurately the regions with
signal have been masked/detected. If the skewness is strong (and
to the positive), then you can tell that you have a lot of
undetected signal in the dataset, and therefore that the
upper-limit measurement (and other measurements) are not reliable.
‘--river-mean’
[Clumps] The average of the river pixel values around this clump.
River pixels were defined in Akhlaghi and Ichikawa 2015
(https://arxiv.org/abs/1505.01664). In short they are the pixels
immediately outside of the clumps. This value is used internally
to find the sum (or magnitude) and signal to noise ratio of the
clumps. It can generally also be used as a scale to gauge the base
(ambient) flux surrounding the clump. If there were no river
pixels, this column will have the value of the Sky under the
clump. So note that this value is _not_ sky subtracted.
‘--river-num’
[Clumps] The number of river pixels around this clump, see
‘--river-mean’.
‘--river-min’
[Clumps] Minimum river value around this clump, see ‘--river-mean’.
‘--river-max’
[Clumps] Maximum river value around this clump, see ‘--river-mean’.
‘--sn’
The signal-to-noise ratio (S/N) of all clumps or objects. See
Akhlaghi and Ichikawa (2015) for the exact equations used.
The returned value will be NaN when the label covers only NaN
pixels in the values image, or a pixel is NaN in the ‘--instd’
image, but non-NaN in the values image. The latter situation
usually happens when there is a bug in the previous steps of your
analysis, and is important because those pixels with a NaN in the
‘--instd’ image may contribute significantly to the final error.
If you want to ignore those pixels in the error measurement, set
them to zero (which is a meaningful number).
‘--sky’
The sky flux (per pixel) value under this object or clump. This is
actually the mean value of all the pixels in the sky image that lie
on the same position as the object or clump.
‘--sky-std’
The sky value standard deviation (per pixel) for this clump or
object. This is the square root of the mean variance under the
object, or the root mean square.
File: gnuastro.info, Node: Surface brightness measurements, Next: Morphology measurements nonparametric, Prev: Brightness measurements, Up: MakeCatalog measurements
7.4.6.5 Surface brightness measurements
.......................................
In astronomy, surface brightness is most commonly measured in units of
magnitudes per arcsec$^2$ (for the formal definition, see *note
Brightness flux magnitude::). Therefore it involves both the values of
the pixels within each input label (or output row) and their position.
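For example, the command below is a sketch of requesting the surface
brightness and its error (with a hypothetical Segment output and zero
point):
$ astmkcatalog seg.fits --ids --sb --sb-error --zeropoint=22.5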
‘--sb’
The surface brightness (in units of mag/arcsec$^2$) of the labeled
region (objects or clumps). For more on the definition of the
surface brightness, see *note Brightness flux magnitude::.
‘--sb-error’
Error in measuring the surface brightness (the ‘--sb’ column).
This column will use the value given to ‘--spatialresolution’ in
the processing (in pixels). For more on ‘--spatialresolution’, see
*note MakeCatalog inputs and basic settings:: and for the equation
used to derive the surface brightness error, see *note Surface
brightness error of each detection::.
‘--upperlimit-sb’
The upper-limit surface brightness (in units of mag/arcsec$^2$) of
this labeled region (object or clump). In other words, this option
measures the surface brightness of noise within the footprint of
each input label.
This is just a simple wrapper over lower-level columns: setting
$B$ and $A$ as the values of the ‘--upperlimit’ and
‘--area-arcsec2’ columns respectively, this column is filled by
simply using the surface brightness equation of *note Brightness
flux magnitude::.
‘--half-sum-sb’
Surface brightness (in units of mag/arcsec$^2$) within the area
that contains half the total sum of the label's pixels (object or
clump). This is useful as a measure of the sharpness of an
astronomical object: for example a star will have very few pixels
at half the maximum, so its ‘--half-sum-sb’ will be much brighter
than a galaxy at the same magnitude. Also consider ‘--half-max-sb’
below.
This column just plugs in the values of half the value of the
‘--sum’ column and the ‘--half-sum-area’ column, into the surface
brightness equation. Therefore please see the description in
‘--half-sum-area’ to understand the systematics of this column and
potential biases (see *note Morphology measurements
nonparametric::).
‘--half-max-sb’
The surface brightness (in units of mag/arcsec$^2$) within the
region that contains half the maximum value of the labeled region.
Like ‘--half-sum-sb’, this option is a good way to identify the
"central" surface brightness of an object. To know when this
measurement is reasonable, see the description of ‘--fwhm’ in *note
Morphology measurements nonparametric::.
‘--sigclip-mean-sb’
Surface brightness (over 1 pixel's area in arcsec$^2$) of the
sigma-clipped mean value of the pixel values distribution
associated to each label (object or clump). This is useful in
scenarios where your labels have approximately _constant_ surface
brightness values after removing outliers (for example in a radial
profile, see *note Invoking astscript-radial-profile::).
In other scenarios it should be used with extreme care. For
example, over the full area of a galaxy or star, the pixel
distribution is not constant (or symmetric after adding noise): it
is inherently skewed, with few pixels in the center having very
large values and many pixels in the outer parts having lower
values. Therefore, sigma-clipping is not
meaningful for such objects! For more on the definition of the
surface brightness, see *note Brightness flux magnitude::, for more
on sigma-clipping, see *note Sigma clipping::.
The error in this magnitude can be retrieved from the
‘--sigclip-mean-sb-delta’ column described below, and you can use
the ‘--sigclip-std-sb’ column to find when the magnitude has become
noise-dominated (signal-to-noise ratio is roughly 1). See the
description of these two options for more.
‘--sigclip-mean-sb-delta’
Scatter in the ‘--sigclip-mean-sb’ without using the standard
deviation of each pixel (that is given by ‘--instd’ in *note
MakeCatalog inputs and basic settings::). The scatter here is
measured from the values of the label themselves. This measurement
is therefore most meaningful when you expect the flux across one
label to be constant (as in a radial profile for example).
This is calculated using the equation in *note Surface brightness
error of each detection::, where $\Delta{A}=0$ (since sigma-clip is
calculated per pixel and there is no error in a single pixel).
Within the equation to derive $\Delta{M}$ (the error in magnitude,
derived in *note Magnitude measurement error of each detection::),
the signal-to-noise ratio is defined by dividing the sigma-clipped
mean by the sigma-clipped standard deviation.
‘--sigclip-std-sb’
The surface brightness of the sigma-clipped standard deviation of
all the pixels with the same label. For labels that are expected
to have the same value in all their pixels (for example each
annulus of a radial profile) this can be used to find the reliable
($1\sigma$) surface brightness for that label. In other words, if
‘--sigclip-mean-sb’ is fainter than the value of this column, you
know that noise is becoming significant. However, as described in
‘--sigclip-mean-sb’, the sigma-clipped measurements of MakeCatalog
should only be used in certain situations like radial profiles, see
the description there for more.
File: gnuastro.info, Node: Morphology measurements nonparametric, Next: Morphology measurements elliptical, Prev: Surface brightness measurements, Up: MakeCatalog measurements
7.4.6.6 Morphology measurements (non-parametric)
................................................
Morphology is defined as a way to quantify the "shape" of an object in your
input image. This includes both the position and value of the pixels
within your input labels. There are many ways to define the morphology
of an object. In this section, we will review the available
non-parametric measures of morphology. By non-parametric, we mean that
no functional shape is assumed for the measurement.
In *note Morphology measurements elliptical:: you can see some
parametric elliptical measurements (which are only valid when the object
is actually an ellipse).
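For example, the command below is a sketch of requesting some basic
non-parametric morphology columns on a hypothetical Segment output:
$ astmkcatalog seg.fits --ids --num-clumps --area --fwhm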
‘--num-clumps’
[Objects] The number of clumps in this object.
‘--area’
The raw area (number of pixels/voxels) of any clump or object that
was used in the measurements: pixels that are NaN/blank (or
otherwise unused) in the values image are not counted here. For
the area that is fully independent of pixel values, see
‘--geo-area’.
‘--area-arcsec2’
The used (non-blank in values image) area of the labeled region in
units of arc-seconds squared. This column is just the value of the
‘--area’ column, multiplied by the area of each pixel in the input
image (in units of arcsec^2). Similar to the ‘--ra’ or ‘--dec’
columns, for this option to work, the objects extension used has to
have a WCS structure.
‘--area-min-val’
The number of pixels that are equal to the minimum value of the
labeled region (clump or object).
‘--area-max-val’
The number of pixels that are equal to the maximum value of the
labeled region (clump or object).
‘--area-xy’
Similar to ‘--area’, when the clump or object is projected onto the
first two dimensions. This is only available for 3-dimensional
datasets. When working with Integral Field Unit (IFU) datasets,
this projection onto the first two dimensions would be a
narrow-band image.
‘--fwhm’
The full width at half maximum (in units of pixels, along the
semi-major axis) of the labeled region (object or clump). The
maximum value is estimated from the mean of the top-three pixels
with the highest values, see the description under ‘--maximum’.
The number of pixels with values above half of that maximum is
then found (the value in the ‘--half-max-area’ column) and a
radius is estimated from that area. See the description under
‘--half-sum-radius’ for more on converting area to radius along
major axis.
Because of its non-parametric nature, this column is most reliable
on clumps and should only be used in objects with great caution.
This is because objects can have more than one clump (peak with
true signal) and multiple peaks are not treated separately in
objects, so the result of this column will be biased.
Also, because of its non-parametric nature, this FWHM does not
account for the PSF, and it will be strongly affected by noise if
the object is faint/diffuse. So when half the maximum value (which
can be requested using the ‘--maximum’ column) is too close to the
local noise level (which can be requested using the ‘--sky-std’
column), the value returned in this column is meaningless (it is
just noise peaks which are randomly distributed over the area).
You can therefore use the ‘--maximum’ and ‘--sky-std’ columns to
remove, or flag, unreliable FWHMs; see the example at the end of
this section. For example, if a labeled region's maximum is less
than 2 times the sky standard deviation, the value will certainly
be unreliable (half of that is $1\sigma$!). For a more reliable
value, this fraction should be around 4 (so half the maximum is
$2\sigma$).
‘--half-max-area’
The number of pixels with values larger than half the maximum flux
within the labeled region. This option is used to estimate
‘--fwhm’, so please read the notes there for the caveats and
necessary precautions.
‘--half-max-radius’
The radius of the region containing half the maximum flux within the
labeled region. This is just half the value reported by ‘--fwhm’.
‘--half-max-sum’
The sum of the pixel values containing half the maximum flux within
the labeled region (or those that are counted in ‘--half-max-area’).
This option uses the pixels within ‘--fwhm’, so please read the
notes there for the caveats and necessary precautions.
‘--half-sum-area’
The number of pixels that contain half the object or clump's total
sum of pixels (half the value in the ‘--sum’ column). To count
this area, all the non-blank values associated with the given label
(object or clump) will be sorted and summed in order (starting from
the maximum), until the sum becomes larger than half the total sum
of the label's pixels.
This option is thus good for clumps (which are defined to have a
single peak in their morphology), but for objects you should be
careful: if the object includes multiple peaks/clumps at roughly
the same level, then the area reported by this option will be
distributed over all the peaks.
‘--half-sum-radius’
Radius (in units of pixels) derived from the area that contains
half the total sum of the label's pixels (value reported by
‘--half-sum-area’). If the area is $A_h$ and the axis ratio is $q$,
then the value returned in this column is $\sqrt{A_h/({\pi}q)}$.
This option is a good measure of the concentration of the
_observed_ (after PSF convolution and noise) object or clump, but
as described below it underestimates the effective radius. Also,
it should be used with caution for objects that may have multiple
clumps. It is most reliable with clumps, or objects that have one
or zero clumps, see the note under ‘--half-sum-area’.
Recall that in general, for an ellipse with semi-major axis $a$,
semi-minor axis $b$, and axis ratio $q=b/a$ the area ($A$) is
$A={\pi}ab={\pi}qa^2$. For a circle (where $q=1$), this simplifies
to the familiar $A={\pi}a^2$.
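For example, with these definitions, a label with a half-sum area
of $A_h=100$ pixels and an axis ratio of $q=0.5$ will get a value
of $\sqrt{100/({\pi}\times0.5)}\approx8$ pixels in this column.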
This option should not be confused with the _effective radius_ for
Sérsic profiles, commonly written as $r_e$. For more on the Sérsic
profile and $r_e$, please see *note Galaxies::. Therefore, when
$r_e$ is meaningful for the target (the target is elliptically
symmetric and can be parameterized as a Sérsic profile), $r_e$
should be derived from fitting the profile with a Sérsic function
which has been convolved with the PSF. But from the equation above,
you see that this radius is derived from the raw image's labeled
values (after convolution, with no parametric profile), so this
column's value will generally be (much) smaller than $r_e$,
depending on the PSF, depth of the dataset, the morphology, or if a
fraction of the profile falls on the edge of the image.
In other words, this option can only be interpreted as an effective
radius if there is no noise and no PSF and the profile within the
image extends to infinity (or a very large multiple of the
effective radius) and it is not near the edge of the image.
‘--frac-max1-area’
‘--frac-max2-area’
Number of pixels brighter than the given fraction(s) of the maximum
pixel value. For the maximum value, see the description of the
‘--maximum’ column. The fraction(s) are given through the
‘--frac-max’ option (that can take two values) and is described in
*note MakeCatalog inputs and basic settings::. Recall that in
‘--half-max-area’, the fraction is fixed to 0.5. Hence, together
with these two columns, you can sample three parts of the
profile's area.
‘--frac-max1-sum’
‘--frac-max2-sum’
Sum of pixels brighter than the given fraction(s) of the maximum
pixel value. For the maximum value, see the description of the
‘--maximum’ column. The fraction(s) are given through the
‘--frac-max’ option (that can take two values) and is described in
*note MakeCatalog inputs and basic settings::. Recall that in
‘--half-max-sum’, the fraction is fixed to 0.5. Hence, together
with these two columns, you can sample three parts of the profile's sum
of pixels.
‘--frac-max1-radius’
‘--frac-max2-radius’
Radius (in units of pixels) derived from the area that contains the
given fractions of the maximum valued pixel(s) of the label's
pixels (value reported by ‘--frac-max1-area’ or
‘--frac-max2-area’). For the maximum value, see the description of
the ‘--maximum’ column. The fractions are given through the
‘--frac-max’ option (that can take two values) and is described in
*note MakeCatalog inputs and basic settings::. Recall that in
‘--fwhm’, the fraction is fixed to 0.5. Hence, together with these
two columns, you can sample three parts of the profile's radius.
‘--clumps-area’
[Objects] The total area of all the clumps in this object.
‘--weight-area’
The area (number of pixels) used in the flux weighted position
calculations.
‘--geo-area’
The area of all the pixels labeled with an object or clump. Note
that unlike ‘--area’, pixel values are completely ignored in this
column. For example, if a pixel value is blank, it will not be
counted in ‘--area’, but will be counted here.
‘--geo-area-xy’
Similar to ‘--geo-area’, when the clump or object is projected onto
the first two dimensions. This is only available for 3-dimensional
datasets. When working with Integral Field Unit (IFU) datasets,
this projection onto the first two dimensions would be a
narrow-band image.
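As an example of the ‘--fwhm’ reliability check described above, the
sketch below first requests the necessary columns and then uses
Gnuastro's Table program (with column arithmetic) to print each
label's ratio of maximum value to local sky standard deviation; rows
where this ratio is below roughly 4 have an unreliable FWHM. The file
and column names here are illustrative, you can confirm the column
names of your own catalog with ‘asttable seg_cat.fits --info’:
$ astmkcatalog seg.fits --ids --fwhm --maximum --sky-std \
               --output=seg_cat.fits
$ asttable seg_cat.fits -cOBJ_ID,FWHM -c'arith MAXIMUM SKY_STD /'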
File: gnuastro.info, Node: Morphology measurements elliptical, Next: Measurements per slice spectra, Prev: Morphology measurements nonparametric, Up: MakeCatalog measurements
7.4.6.7 Morphology measurements (elliptical)
............................................
When your target objects are sufficiently ellipse-like, you can use the
measurements below to quantify the various parameters of the ellipse.
For details of how the elliptical parameters are measured, see *note
Measuring elliptical parameters::. For non-parametric morphological
measurements, see *note Morphology measurements nonparametric::. The
measures that start with ‘--geo-*’ ignore the pixel values and just do
the measurements on the label's "geometric" shape.
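For example, the command below is a sketch of requesting the
pixel-value weighted elliptical parameters on a hypothetical Segment
output:
$ astmkcatalog seg.fits --ids --semi-major --semi-minor \
               --axis-ratio --position-angle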
‘--semi-major’
The pixel-value weighted root mean square (RMS) along the
semi-major axis of the profile (assuming it is an ellipse) in units
of pixels.
‘--semi-minor’
The pixel-value weighted root mean square (RMS) along the
semi-minor axis of the profile (assuming it is an ellipse) in units
of pixels.
‘--axis-ratio’
The pixel-value weighted axis ratio (semi-minor/semi-major) of the
object or clump.
‘--position-angle’
The pixel-value weighted angle of the semi-major axis with the
first FITS axis in degrees.
‘--geo-semi-major’
The geometric (ignoring pixel values) root mean square (RMS) along
the semi-major axis of the profile, assuming it is an ellipse, in
units of pixels.
‘--geo-semi-minor’
The geometric (ignoring pixel values) root mean square (RMS) along
the semi-minor axis of the profile, assuming it is an ellipse, in
units of pixels.
‘--geo-axis-ratio’
The geometric (ignoring pixel values) axis ratio of the profile,
assuming it is an ellipse.
‘--geo-position-angle’
The geometric (ignoring pixel values) angle of the semi-major axis
with the first FITS axis in degrees.
File: gnuastro.info, Node: Measurements per slice spectra, Prev: Morphology measurements elliptical, Up: MakeCatalog measurements
7.4.6.8 Measurements per slice (spectra)
........................................
When the input is a 3D data cube, MakeCatalog has the following
multi-valued measurements per label. For a tutorial on how to use these
options and interpret their values, see *note Detecting lines and
extracting spectra in 3D data::.
These options will do measurements on each 2D slice of the input 3D
cube; hence the common format of ‘--*-in-slice’. Since each slice
usually corresponds to a certain wavelength, you can also think of
these measurements as spectra.
For each row (input label), each of the columns described here will
contain multiple values as a vector column. The number of measurements
in each column is the number of slices in the cube, or the size of the
cube along the third dimension. To learn more about vector columns and
how to manipulate them, see *note Vector columns::. For example usage
of these columns in the tutorial above, see *note 3D measurements and
spectra:: and *note Extracting a single spectrum and plotting it::.
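For example, the command below is a sketch of producing per-slice
measurements on a hypothetical 3D Segment output; each requested
column will be a vector column with one element per slice:
$ astmkcatalog cube-seg.fits --ids --area-in-slice --sum-in-slice \
               --sum-proj-in-slice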
There are two ways to do each measurement on a slice for each label:
Only label
The measurement will only be done on the voxels in the slice that
are associated to that label. These types of per-slice measurement
therefore have the following properties:
• This will only be a measurement of that label and will not be
affected by any other label.
• The number of voxels used in each slice can be different
(usually only one or two voxels at the two extremes of the
label along the third dimension, and many in the middle).
• Since most labels are localized along the third dimension
(maybe only covering 20 slices out of thousands!), many of the
measurements (on slices where the label doesn't exist) will be
NaN (for the sum measurements for example) or 0 (for the area
measurements).
Projected label
MakeCatalog will first project the 3D label into a 2D surface
(along the third dimension) to get its 2D footprint. Afterwards,
all the voxels in that 2D footprint will be measured in all
slices. All these measurements will have a ‘-proj-’ component in
their name. These types of per-slice measurement therefore have the
following properties:
• A measurement will be done on each slice of the cube.
• All measurements will be done on the same surface area.
• Labels can overlap when they are projected onto the first two
FITS dimensions (the spatial coordinates, not spectral). As a
result, other emission lines or objects may contaminate the
resulting spectrum for each label.
To help separate other labels, MakeCatalog can do a third type of
measurement on each slice: measurements on the voxels that belong
to other labels but overlap with the 2D projection. This can be
used to see how much your projected measurement is affected by
other emission sources (on the projected spectra) and also if
multiple lines (labeled regions) belong to the same physical
object. These measurements contain ‘-other-’ in their name.
‘--sum-in-slice’
[Only label] Sum of values in each slice.
‘--sum-err-in-slice’
[Only label] Error in ‘--sum-in-slice’.
‘--area-in-slice’
[Only label] Number of labeled voxels in each slice.
‘--sum-proj-in-slice’
[Projected label] Sum of values within the label's projected
footprint in each slice.
‘--area-proj-in-slice’
[Projected label] Number of voxels that are used in
‘--sum-proj-in-slice’.
‘--sum-proj-err-in-slice’
[Projected label] Error of ‘--sum-proj-in-slice’.
‘--area-other-in-slice’
[Projected label] Area of other labels within the projected
footprint on each slice.
‘--sum-other-in-slice’
[Projected label] Sum of other labels' values within the projected
footprint on each slice.
‘--sum-other-err-in-slice’
[Projected label] Error in ‘--sum-other-in-slice’.
File: gnuastro.info, Node: Invoking astmkcatalog, Prev: MakeCatalog measurements, Up: MakeCatalog
7.4.7 Invoking MakeCatalog
--------------------------
MakeCatalog will do measurements and produce a catalog from a labeled
dataset and optional values dataset(s). The executable name is
‘astmkcatalog’ with the following general template
$ astmkcatalog [OPTION ...] InputImage.fits
One line examples:
## Create catalog with RA, Dec, Magnitude and Magnitude error,
## from Segment's output:
$ astmkcatalog --ra --dec --magnitude seg-out.fits
## Same catalog as above (using short options):
$ astmkcatalog -rdm seg-out.fits
## Write the catalog to a text table:
$ astmkcatalog -rdm seg-out.fits --output=cat.txt
## Output columns specified in `columns.conf':
$ astmkcatalog --config=columns.conf seg-out.fits
## Use object and clump labels from a K-band image, but pixel values
## from an i-band image.
$ astmkcatalog K_segmented.fits --hdu=DETECTIONS --clumpscat \
--clumpsfile=K_segmented.fits --clumpshdu=CLUMPS \
--valuesfile=i_band.fits
If MakeCatalog is to do processing (not printing help or option values),
an input labeled image should be provided. The options described in
this section are those that are particular to MakeCatalog. For
operations that MakeCatalog shares with other programs (mainly
involving input/output or general processing steps), see *note Common
options::. Also see *note Common program behavior:: for some general
characteristics of all Gnuastro programs including MakeCatalog.
The various measurements/columns of MakeCatalog are requested as
options, either on the command-line or in configuration files, see *note
Configuration files::. The full list of available columns is available
in *note MakeCatalog measurements::. Depending on the requested
columns, MakeCatalog needs more than one input dataset, for more
details, please see *note MakeCatalog inputs and basic settings::. The
upper-limit measurements in particular need several configuration
options which are thoroughly discussed in *note Upper-limit settings::.
Finally, in *note MakeCatalog output:: the output file(s) created by
MakeCatalog are discussed.
* Menu:
* MakeCatalog inputs and basic settings:: Input files and basic settings.
* Upper-limit settings:: Settings for upper-limit measurements.
* MakeCatalog output:: File names of MakeCatalog's output table.
File: gnuastro.info, Node: MakeCatalog inputs and basic settings, Next: Upper-limit settings, Prev: Invoking astmkcatalog, Up: Invoking astmkcatalog
7.4.7.1 MakeCatalog inputs and basic settings
.............................................
MakeCatalog works by using a localized/labeled dataset (see *note
MakeCatalog::). This dataset maps/labels pixels to a specific target
(row number in the final catalog) and is thus the only necessary input
dataset to produce a minimal catalog in any situation. Because it only
has labels/counters, it must have an integer type (see *note Numeric
data types::), see below if your labels are in a floating point
container. When the requested measurements only need this dataset (for
example, ‘--geo-x’, ‘--geo-y’, or ‘--geo-area’), MakeCatalog will not
read any more datasets.
Low-level measurements that only use the labeled image are rarely
sufficient for any high-level science case. Therefore, the necessary input
datasets depend on the requested columns in each run. For example,
let's assume you want the brightness/magnitude and signal-to-noise ratio
of your labeled regions. For these columns, you will also need to
provide an extra dataset containing values for every pixel of the
labeled input (to measure magnitude) and another for the Sky standard
deviation (to measure error). All such auxiliary input files have to
have the same size (number of pixels in each dimension) as the input
labeled image. Their numeric data type is irrelevant (they will be
converted to 32-bit floating point internally). For the full list of
available measurements, see *note MakeCatalog measurements::.
The "values" dataset is used for measurements like
brightness/magnitude, or flux-weighted positions. If it is a real
image, by default it is assumed to be already Sky-subtracted prior to
running MakeCatalog. If it is not, you can use the ‘--subtractsky’
option so that MakeCatalog reads and subtracts the Sky dataset before
any processing. To obtain the Sky value, you can use the ‘--sky’
option of *note Statistics::, but the recommended method is *note
NoiseChisel::, see *note Sky value::.
MakeCatalog can also do measurements on sub-structures of detections.
In other words, it can produce two catalogs. Following the nomenclature
of Segment (see *note Segment::), the main labeled input dataset is
known as "object" labels and the (optional) sub-structure input dataset
is known as "clumps". If MakeCatalog is run with the ‘--clumpscat’
option, it will also need a labeled image containing clumps, similar to
what Segment produces (see *note Segment output::). Since clumps are
defined within detected regions (they exist over signal, not noise),
MakeCatalog uses their boundaries to subtract the level of signal under
them.
There are separate options to explicitly request a file name and
HDU/extension for each of the required input datasets as fully described
below (with the ‘--*file’ format). When each dataset is in a separate
file, these options are necessary. However, one great advantage of the
FITS file format (that is heavily used in astronomy) is that it allows
the storage of multiple datasets in one file. So in most situations
(for example, if you are using the outputs of *note NoiseChisel:: or
*note Segment::), all the necessary input datasets can be in one file.
When none of the ‘--*file’ options are given (for example
‘--clumpsfile’ or ‘--valuesfile’), MakeCatalog will assume the necessary
input datasets are available as HDUs in the file given as its argument
(without any option). When the Sky or Sky standard deviation datasets
are necessary and the only ‘--*file’ option called is ‘--valuesfile’,
MakeCatalog will search for these datasets (with the default/given HDUs)
in the file given to ‘--valuesfile’ (before looking into the main
argument file).
It may happen that your labeled objects image was created with a
program that only outputs floating point files. However, you know it
only has integer valued pixels that are stored in a floating point
container. In such cases, you can use Gnuastro's Arithmetic program
(see *note Arithmetic::) to change the numerical data type of the image
(‘float.fits’) to an integer type image (‘int.fits’) with a command like
below:
$ astarithmetic float.fits int32 --output=int.fits
To summarize: if the input file to MakeCatalog is the default/full
output of Segment (see *note Segment output::) you do not have to worry
about any of the ‘--*file’ options below. You can just give Segment's
output file to MakeCatalog as described in *note Invoking
astmkcatalog::. To feed NoiseChisel's output into MakeCatalog, just
change the labeled dataset's header (with ‘--hdu=DETECTIONS’). The full
list of input dataset options and general setting options are described
below.
‘-l FITS’
‘--clumpsfile=FITS’
The FITS file containing the labeled clumps dataset when
‘--clumpscat’ is called (see *note MakeCatalog output::). When
‘--clumpscat’ is called, but this option is not, MakeCatalog will
look into the main input file (given as an argument) for the
required extension/HDU (value to ‘--clumpshdu’).
‘--clumpshdu=STR’
The HDU/extension of the clump labels dataset. The clump labels
dataset has to be an integer data type (see *note Numeric data
types::) and only pixels with a value larger than zero will be
used. See *note
Segment output:: for a description of the expected format.
‘-v FITS’
‘--valuesfile=FITS’
The file name of the (sky-subtracted) values dataset. When any of
the columns need values to associate with the input labels (for
example, to measure the sum of pixel values or magnitude of a
galaxy, see *note Brightness flux magnitude::), MakeCatalog will
look into a "values" dataset for the respective pixel values. In most
common processing, this is the actual astronomical image that the
labels were defined, or detected, over. The HDU/extension of this
dataset in the given file can be specified with ‘--valueshdu’. If
this option is not called, MakeCatalog will look for the given
extension in the main input file.
‘--valueshdu=STR/INT’
The name or number (counting from zero) of the extension containing
the "values" dataset, see the descriptions above and those in
‘--valuesfile’ for more.
‘-s FITS/FLT’
‘--insky=FITS/FLT’
Sky value as a single number, or the file name containing a dataset
(different values per pixel or tile). The Sky dataset is only
necessary when ‘--subtractsky’ is called or when a column directly
related to the Sky value is requested (currently ‘--sky’). This
dataset may be a tessellation, with one element per tile (see
‘--oneelempertile’ of NoiseChisel's *note Processing options::).
When the Sky dataset is necessary but this option is not called,
MakeCatalog will assume it is an HDU/extension (specified by
‘--skyhdu’) in one of the already given files. First it will look
for it in the ‘--valuesfile’ (if it is given) and then the main
input file (given as an argument).
By default the values dataset is assumed to be already Sky
subtracted, so this dataset is not necessary for many of the
columns.
‘--skyhdu=STR’
HDU/extension of the Sky dataset, see ‘--insky’.
‘--subtractsky’
Subtract the sky value or dataset from the values file prior to any
processing.
‘-t STR/FLT’
‘--instd=STR/FLT’
Sky standard deviation value as a single number, or the file name
containing a dataset (different values per pixel or tile). With
the ‘--variance’ option you can tell MakeCatalog to interpret this
value/dataset as a variance image, not standard deviation.
*Important note:* This must only be the SKY standard deviation or
variance (not including the signal's contribution to the error).
In other words, the final standard deviation of a pixel depends on
how much signal there is in it. MakeCatalog will find the amount
of signal within each pixel (while subtracting the Sky, if
‘--subtractsky’ is called) and account for the extra error due to
its value (signal). Therefore if the input standard deviation (or
variance) image also contains the contribution of signal to the
error, then the final error measurements will be over-estimated.
‘--stdhdu=STR’
The HDU of the Sky value standard deviation image.
‘--variance’
The dataset given to ‘--instd’ (and ‘--stdhdu’) has the Sky
variance of every pixel, not the Sky standard deviation.
‘--forcereadstd’
Read the input STD image even if it is not required by any of the
requested columns. This is because some of the output catalog's
metadata may need it, for example, to calculate the dataset's
surface brightness limit (see *note Quantifying measurement
limits::, configured with ‘--sfmagarea’ and ‘--sfmagnsigma’ in
*note MakeCatalog output::).
Furthermore, if the input STD image does not have the ‘MEDSTD’
keyword (that is meant to contain the representative standard
deviation of the full image), with this option, the median will be
calculated and used for the surface brightness limit.
‘-z FLT’
‘--zeropoint=FLT’
The zero point magnitude for the input image, see *note Brightness
flux magnitude::.
‘--sigmaclip=FLT,FLT’
The sigma-clipping parameters when any of the sigma-clipping
related columns are requested (for example, ‘--sigclip-median’ or
‘--sigclip-number’).
This option takes two values: the first is the multiple of
$\sigma$, and the second is the termination criteria. If the
latter is larger than 1, it is read as an integer number and will
be the number of times to clip. If it is smaller than 1, it is
interpreted as the tolerance level to stop clipping. See *note
Sigma clipping:: for a complete explanation.
‘--frac-max=FLT[,FLT]’
The fractions (one or two) of maximum value in objects or clumps to
be used in the related columns, for example, ‘--frac-max1-area’,
‘--frac-max1-sum’ or ‘--frac-max1-radius’, see *note MakeCatalog
measurements::. For the maximum value, see the description of the
‘--maximum’ column there. The value(s) of this option must be
larger than 0 and smaller than 1 (they are a fraction). When only
‘--frac-max1-area’ or ‘--frac-max1-sum’ is requested, one value
must be given to this option, but if ‘--frac-max2-area’ or
‘--frac-max2-sum’ are also requested, two values must be given to
this option. The values can be written as simple floating point
numbers, or as fractions, for example, ‘0.25,0.75’ and ‘0.25,3/4’
are the same.
‘--spatialresolution=FLT’
The error in measuring spatial properties (for example, the area)
in units of pixels. You can think of this as the FWHM of the
dataset's PSF and is used in measurements like the error in surface
brightness (‘--sb-error’, see *note MakeCatalog measurements::).
Ideally, images are taken at the optimal Nyquist sampling (see
*note Sampling theorem::), so the default value for this option is
2. But in practice real images may be over-sampled (usually
ground-based
images, where you will need to increase the default value) or
undersampled (some space-based images, where you will need to
decrease the default value).
‘--inbetweenints’
Output will contain one row for every integer between 1 and the
largest label in the input (irrespective of their existence in the
input image). By default, MakeCatalog's output will only contain
rows with integers that actually corresponded to at least one pixel
in the input dataset.
For example, if the input's only labeled pixel values are 11 and
13, MakeCatalog's default output will only have two rows. If you
use this option, it will have 13 rows and all the columns
corresponding to integer identifiers that did not correspond to any
pixel will be 0 or NaN (depending on context).
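As a summarizing sketch of the options above (all file names and
values are hypothetical), the command below takes the labels from one
file and the values from another, gives the Sky standard deviation as
a single number, and requests sigma-clipped measurements:
$ astmkcatalog labels.fits --valuesfile=image.fits --instd=0.01 \
               --zeropoint=25.0 --sigmaclip=3,0.1 \
               --sigclip-mean --sigclip-std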
File: gnuastro.info, Node: Upper-limit settings, Next: MakeCatalog output, Prev: MakeCatalog inputs and basic settings, Up: Invoking astmkcatalog
7.4.7.2 Upper-limit settings
............................
The upper-limit magnitude was discussed in *note Quantifying measurement
limits::. Unlike other measured values/columns in MakeCatalog, the
upper limit magnitude needs several extra parameters which are discussed
here. All the options specific to the upper-limit measurements start
with ‘up’ for "upper-limit". The only exception is ‘--envseed’ that is
also present in other programs and is general for any job requiring
random number generation in Gnuastro (see *note Generating random
numbers::).
One very important consideration in Gnuastro is reproducibility.
Therefore, the values to all of these parameters along with others (like
the random number generator type and seed) are also reported in the
comments of the final catalog when the upper limit magnitude column is
desired. The random seed that is used to define the random positions
for each object or clump is unique and set based on the (optionally)
given seed, the total number of objects and clumps and also the labels
of the clumps and objects. So with identical inputs, an identical
upper-limit magnitude will be found. However, even if the seed is
identical, when the ordering of the object/clump labels differs between
different runs, the result of upper-limit measurements will not be
identical.
MakeCatalog will randomly place the object/clump footprint over the
dataset. When the randomly placed footprint does not fall on any object
or masked region (see ‘--upmaskfile’) it will be used in the final
distribution. Otherwise that particular random position will be ignored
and another random position will be generated. Finally, when the
distribution has the desired number of successfully measured random
samples (‘--upnum’) the distribution's properties will be measured and
placed in the catalog.
When the profile is very large or the image is significantly covered
by detections, it might not be possible to find the desired number of
samplings in a reasonable time. MakeCatalog will continue searching
until it is unable to find a successful position (since the last
successful measurement(1)), for a large multiple of ‘--upnum’
(currently(2) this is 10). If ‘--upnum’ successful samples cannot be
found until this limit is reached, MakeCatalog will set the upper-limit
magnitude for that object to NaN (blank).
MakeCatalog will also print a warning if the range of positions
available for the labeled region is smaller than double the size of the
region. In such cases, the limited range of random positions can
artificially decrease the standard deviation of the final distribution.
If your dataset can allow it (it is large enough), it is recommended to
use a larger range if you see such warnings.
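For example, the commands below are a sketch of an upper-limit run on
a hypothetical Segment output: the random number generator seed is
read from the environment for reproducibility (see *note Generating
random numbers::), 1000 successful random samples are used per label,
and the upper-limit magnitude is reported at $3\sigma$:
$ export GSL_RNG_SEED=1599251212
$ astmkcatalog seg.fits --ids --upperlimit-mag --zeropoint=25.0 \
               --upnum=1000 --upnsigma=3 --envseed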
‘--upmaskfile=FITS’
File name of mask image to use for upper-limit calculation. In
some cases (especially when doing matched photometry), the object
labels specified in the main input and mask image might not be
adequate. In other words they do not necessarily have to cover
_all_ detected objects: the user might have selected only a few of
the objects in their labeled image. This option can be used to
ignore regions in the image in these situations when estimating the
upper-limit magnitude. All the non-zero pixels of the image
specified by this option (in the ‘--upmaskhdu’ extension) will be
ignored in the upper-limit magnitude measurements.
For example, when you are using labels from another image, you can
give NoiseChisel's objects image output for this image as the value
to this option. In this way, you can be sure that regions with
data do not harm your distribution. See *note Quantifying
measurement limits:: for more on the upper limit magnitude.
‘--upmaskhdu=STR’
The extension in the file specified by ‘--upmaskfile’.
‘--upnum=INT’
The number of random samples to take for every object.  A larger
value for this option will give a more accurate result
(asymptotically), but it will also slow down the process.  When a
randomly positioned sample overlaps with a detected/masked pixel it
is not counted and another random position is tried, until the
footprint lies completely over an undetected region.  So you can be
sure that for each object, this many samples over undetected
regions are used.  See the upper limit magnitude discussion in
*note Quantifying measurement limits:: for more.
‘--uprange=INT,INT’
The range/width of the region (in pixels) to do random sampling
along each dimension of the input image around each object's
position. This is not a mandatory option and if not given (or
given a value of zero in a dimension), the full possible range of
the dataset along that dimension will be used. This is useful when
the noise properties of the dataset vary gradually. In such cases,
using the full range of the input dataset is going to bias the
result. However, note that decreasing the range of available
positions too much will also artificially decrease the standard
deviation of the final distribution (and thus bias the upper-limit
measurement).
‘--envseed’
Read the random number generator type and seed value from the
environment (see *note Generating random numbers::). Random
numbers are used in calculating the random positions of different
samples of each object.
‘--upsigmaclip=FLT,FLT’
The raw distribution of random values will not be used to find the
upper-limit magnitude; it will first be $\sigma$-clipped (see *note
Sigma clipping::) to avoid outliers in the distribution (mainly the
faint undetected wings of bright/large objects in the image).  This
option takes two values: the first is the multiple of $\sigma$, and
the second is the termination criterion.  If the latter is larger
than 1, it is read as an integer and is the number of times to
clip.  If it is smaller than 1, it is interpreted as the tolerance
level to stop clipping.  For example, ‘--upsigmaclip=3,5’ will
$3\sigma$-clip the distribution five times, while
‘--upsigmaclip=3,0.2’ will $3\sigma$-clip it until a tolerance of
0.2 is reached.  See *note Sigma clipping:: for a complete
explanation.
‘--upnsigma=FLT’
The multiple of the final ($\sigma$-clipped) standard deviation (or
$\sigma$) used to measure the upper-limit sum or magnitude.
‘--checkuplim=INT[,INT]’
Print a table of the positions and measured values of the full
random distribution used for one particular object or clump.  If
only one integer is given to this option, it is interpreted to be
an object's label.  If two values are given, the first is the
object label and the second is the ID of the requested clump within it.
The output is a table with three columns (whether it is FITS or
plain-text is determined with the ‘--tableformat’ option, see *note
Input output options::). The first two columns are the pixel X,Y
positions of the center of each label's tile (see next paragraph),
in each random sampling of this particular object/clump. The third
column is the measured flux over that region. If the region
overlapped with a detection or masked pixel, then its measured
value will be a NaN (not-a-number). The total number of rows is
thus unknown before running. However, if an upper-limit
measurement was made in the main output of MakeCatalog, you can be
sure that the number of rows with non-NaN measurements is the
number given to the ‘--upnum’ option.
The "tile" of each label is defined by the minimum and maximum
positions of each label: values of the ‘--min-x’, ‘--max-x’,
‘--min-y’ and ‘--max-y’ columns in the main output table for each
label.  Therefore, the tile center position that is recorded in the
output of this option ignores the distribution of labeled pixels
within the tile.
Precise interpretation of the position is only relevant when the
footprint of your label is highly asymmetric and you want to use
this catalog to insert your object into the image.  In such a
case, you can also ask for ‘--min-x’ and ‘--min-y’ and manually
calculate their difference with the following two positional
measurements of your desired label: ‘--geo-x’ and ‘--geo-y’ (which
report the label's "geometric" center, only using the label
positions and ignoring any "values") or ‘--x’ and ‘--y’ (which
report the value-weighted center of the label).  Adding this
difference to the position reported here will let you define
alternative "center"s for your label in particular situations
(this will usually not be necessary!).  For more on these
positional columns, see *note Position measurements in pixels::.
---------- Footnotes ----------
(1) The counting of failed positions restarts on every successful
measurement.
(2) In Gnuastro's source, this constant number is defined as the
‘MKCATALOG_UPPERLIMIT_MAXFAILS_MULTIP’ macro in ‘bin/mkcatalog/main.h’,
see *note Downloading the source::.
File: gnuastro.info, Node: MakeCatalog output, Prev: Upper-limit settings, Up: Invoking astmkcatalog
7.4.7.3 MakeCatalog output
..........................
After it has completed all the requested measurements (see *note
MakeCatalog measurements::), MakeCatalog will store its measurements in
table(s). If an output filename is given (see ‘--output’ in *note Input
output options::), the format of the table will be deduced from the
name. When it is not given, the input name will be appended with a
‘_cat’ suffix (see *note Automatic output::) and its format will be
determined from the ‘--tableformat’ option, which is also discussed in
*note Input output options::. ‘--tableformat’ is also necessary when
the requested output name is a FITS table (recall that FITS can accept
ASCII and binary tables, see *note Table::).
By default (when ‘--spectrum’ or ‘--clumpscat’ are not called) only a
single catalog/table will be created for the labeled "objects".
• If ‘--clumpscat’ is called, a secondary catalog/table will also be
created for "clumps" (one of the outputs of the Segment program;
for more on "objects" and "clumps", see *note Segment::).  In
short, if you only have one labeled image, you do not have to worry
about clumps and can simply ignore this option.
• When ‘--spectrum’ is called, it is not mandatory to specify any
single-valued measurement columns. In this case, the output will
only be the spectra of each labeled region within a 3D datacube.
For more, see the description of ‘--spectrum’ in *note MakeCatalog
measurements::.
When possible, MakeCatalog will also measure the full input's noise
level (also known as surface brightness limit, see *note Quantifying
measurement limits::). Since these measurements are related to the
noise and not any particular labeled object, they are stored as keywords
in the output table. Furthermore, they are only possible when a
standard deviation image has been loaded (done automatically for any
column measurement that involves noise, for example, ‘--sn’,
‘--magnitude-error’ or ‘--sky-std’). But if you just want the surface
brightness limit and no noise-related column, you can use
‘--forcereadstd’. All these keywords start with ‘SBL’ (for "surface
brightness limit") and are described below:
‘SBLSTD’
Per-pixel standard deviation. If a ‘MEDSTD’ keyword exists in the
standard deviation dataset, then that value is directly used.
‘SBLNSIG’
Sigma multiple for surface brightness limit (value you gave to
‘--sfmagnsigma’), used for ‘SBLMAGPX’ and ‘SBLMAG’.
‘SBLMAGPX’
Per-pixel surface brightness limit (in units of magnitudes/pixel).
‘SBLAREA’
Area (in units of arcsec$^2$) used in ‘SBLMAG’ (value you gave to
‘--sfmagarea’).
‘SBLMAG’
Surface brightness limit of data calculated over ‘SBLAREA’ (in
units of mag/arcsec$^2$).
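For example, assuming a hypothetical MakeCatalog output called
‘cat.fits’ with the measurements in its first HDU, you can inspect
these keywords with Gnuastro's Fits program:

     $ astfits cat.fits -h1 | grep ^SBL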
When any of the upper-limit measurements are requested, the input
parameters for the upper-limit measurement are stored in the keywords
starting with ‘UP’: ‘UPNSIGMA’, ‘UPNUMBER’, ‘UPRNGNAM’, ‘UPRNGSEE’,
‘UPSCMLTP’, ‘UPSCTOL’. These are primarily input arguments, so they
correspond to the options with a similar name.
The full list of MakeCatalog's options relating to the output file
format and keywords are listed below. See *note MakeCatalog
measurements:: for specifying which columns you want in the final
catalog.
‘-C’
‘--clumpscat’
Do measurements on clumps and produce a second catalog (only
devoted to clumps). When this option is given, MakeCatalog will
also look for a secondary labeled dataset (identifying
substructure) and produce a catalog from that.  For more on the
definition of "clumps", see *note Segment::.
When the output is a FITS file, the objects and clumps
catalogs/tables will be stored as multiple extensions of one FITS
file. You can use *note Table:: to inspect the column meta-data
and contents in this case. However, in plain text format (see
*note Gnuastro text table format::), it is only possible to keep
one table per file. Therefore, if the output is a text file, two
output files will be created, ending in ‘_o.txt’ (for objects) and
‘_c.txt’ (for clumps).
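For example, assuming a hypothetical labeled input called
‘seg.fits’, a run like the one below should produce two plain-text
tables (‘out_o.txt’ and ‘out_c.txt’):

     $ astmkcatalog seg.fits --clumpscat --ids \
                    --tableformat=txt --output=out.txt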
‘--noclumpsort’
Do not sort the clumps catalog based on object ID (only relevant
with ‘--clumpscat’). This option will benefit the performance(1)
of MakeCatalog when it is run on multiple threads _and_ the
position of the rows in the clumps catalog is irrelevant (for
example, you just want the number-counts).
MakeCatalog does all its measurements on each _object_
independently and in parallel. As a result, while it is writing
the measurements on each object's clumps, it does not know how many
clumps there were in previous objects. Each thread will just fetch
the first available row and write the information of clumps (in
order) starting from that row. After all the measurements are
done, by default (when this option is not called), MakeCatalog will
reorder/permute the clumps catalog to have both the object and
clump ID in an ascending order.
If you would like to order the catalog later (when it is a plain
text file), you can run the following command to sort the rows by
object ID (and clump ID within each object), assuming they are
respectively the first and second columns:
$ awk '!/^#/' out_c.txt | sort -g -k1,1 -k2,2
‘--sfmagnsigma=FLT’
Value to multiply with the median standard deviation (from a
‘MEDSTD’ keyword in the Sky standard deviation image) for
estimating the surface brightness limit. Note that the surface
brightness limit is only reported when a standard deviation image
is read, in other words a column using it is requested (for
example, ‘--sn’) or ‘--forcereadstd’ is called.
This is a per-pixel value, not a per-object/clump value, and is
not measured over an area or aperture like the $5\sigma$ values
that are commonly reported as a measure of depth, or like the
upper-limit measurements (see *note Quantifying measurement
limits::).
‘--sfmagarea=FLT’
Area (in arcsec$^2$) used to convert the per-pixel estimate of
‘--sfmagnsigma’ into a surface brightness limit over this area
(the ‘SBLMAG’ keyword) in the output tables.
Note that the surface brightness limit is only reported when a
standard deviation image is read, in other words a column using it
is requested (for example, ‘--sn’) or ‘--forcereadstd’ is called.
Note that this is just a unit conversion using the World Coordinate
System (WCS) information in the input's header. It does not
actually do any measurements on this area. For random measurements
on any area, please use the upper-limit columns of MakeCatalog (see
the discussion on upper-limit measurements in *note Quantifying
measurement limits::).
---------- Footnotes ----------
(1) The performance boost due to ‘--noclumpsort’ can only be felt
when there are a huge number of objects.  Therefore, by default the
output is sorted to avoid misunderstandings or bugs in the scripts of
users who forget to sort the outputs.
File: gnuastro.info, Node: Match, Prev: MakeCatalog, Up: Data analysis
7.5 Match
=========
Data can come from different telescopes, filters, software and even
different configurations for a single software. As a result, one of the
primary things to do after generating catalogs from each of these
sources (for example, with *note MakeCatalog::), is to find which
sources in one catalog correspond to which in the other(s). In other
words, to 'match' the two catalogs with each other.
Gnuastro's Match program is in charge of such operations. The
nearest objects in the two catalogs, within the given aperture, will be
found and given as output. The aperture can be a circle or an ellipse
with any orientation.
* Menu:
* Matching algorithms:: Different ways to find the match
* Invoking astmatch:: Inputs, outputs and options of Match
File: gnuastro.info, Node: Matching algorithms, Next: Invoking astmatch, Prev: Match, Up: Match
7.5.1 Matching algorithms
-------------------------
Matching involves two catalogs; let's call them catalog A (with N rows)
and catalog B (with M rows).  The most basic matching algorithm that
immediately comes to mind is this: for each row in A (let's call it
$A_i$), go over all the rows in B ($B_j$, where $1\leq j\leq M$) and
calculate the distance $|B_j-A_i|$.  If this distance is less than a
certain acceptable distance threshold (or radius, or aperture), consider
$A_i$ and $B_j$ as a match.
   This basic parsing algorithm is very computationally expensive:
$N\times M$ distances have to be measured, and calculating each
distance requires squaring and a square root: in 2 dimensions it would
be $\sqrt{(B_{jx}-A_{ix})^2+(B_{jy}-A_{iy})^2}$.  If an elliptical
aperture is necessary, it gets even more complicated, see *note
Defining an ellipse and ellipsoid::.  Such operations are not simple,
and will consume many cycles of your CPU!  As a result, this basic
algorithm becomes terribly slow as your datasets grow in size: for
example, when N and M both exceed hundreds of thousands (which is
common these days with datasets like those of the European Space
Agency's Gaia mission), more than $10^{10}$ distances have to be
calculated.  Therefore that basic parsing algorithm will take too much
time and more efficient ways to _find the nearest neighbor_ need to be
found.  Gnuastro's Match currently has two algorithms for finding the
nearest neighbor:
Sort-based
In this algorithm, we will use a moving window over the sorted
datasets:
1. Sort the two datasets by their first coordinate. Therefore
$A_i<A_j$ (when $i<j$; only in first coordinate), and
similarly, sort the elements of B based on the first
coordinate.
2. Use the radial distance threshold to define the width of a
moving interval over both A and B. Therefore, with a single
parsing of both simultaneously, for each A-point, we can find
all the elements in B that are sufficiently near to it (within
the requested aperture).
This method has some caveats: 1) it requires sorting, which can
again be slow on large numbers of rows; 2) it can only be done on a
single CPU thread, so it cannot benefit from modern CPUs with many
threads; 3) there is no way to preserve intermediate information
for future matches (this can greatly help when one of the matched
datasets is always the same).  To use this sorting method in Match,
use ‘--kdtree=disable’, as in the example below.
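For example (with hypothetical catalogs ‘a.fits’ and ‘b.fits’ that
both have ‘RA’ and ‘DEC’ columns), the command below matches the
two within one arcsecond using the sort-based method:

     $ astmatch a.fits --ccol1=RA,DEC b.fits --ccol2=RA,DEC \
                --aperture=1/3600 --kdtree=disable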
k-d tree based
The k-d tree concept is much more abstract, but powerful (it
addresses all the caveats of the sort-based method described
above).  In short, a k-d tree is a partitioning of a k-dimensional
space ("k" is just a place-holder, so together with "d" for
dimension, "k-d" means "any number of dimensions"!).  The k-d tree
of table A is another table with the same number of rows, but only
two integer columns: the integers contain the row indices (counting
from zero) of the left and right "branch" (in the "tree") of that
row.  With a k-d tree we can find the nearest point with
(statistically) far fewer checks, compared to always parsing from
the top-down.  For more on the k-d tree concept and Gnuastro's
implementation, please see *note K-d tree::.
When given two catalogs (like the command below), Gnuastro's Match
will internally construct a k-d tree for catalog A (the first
catalog given to it) and use that k-d tree to find the nearest
A-point to each B-point (this is done in parallel on all available
CPU threads, unless you specify a certain number of threads to use
with ‘--numthreads’, see *note Multi-threaded operations::)
$ astmatch A.fits --ccol1=ra,dec B.fits --ccol2=RA,DEC \
--aperture=1/3600
However, optionally, you can also build the k-d tree of A and save
it into a file, with a separate call to Match, like below
$ astmatch A.fits --ccol1=ra,dec --kdtree=build \
--output=A-kdtree.fits
This external k-d tree (‘A-kdtree.fits’) can be fed to Match later
(to avoid having to reconstruct it every time you want to match a
new catalog with A) like below for matching both ‘B.fits’ and
‘C.fits’ with ‘A.fits’. Note that the same ‘--kdtree’ option
above, is now given the file name of the k-d tree, instead of
‘build’.
$ astmatch A.fits --ccol1=ra,dec --kdtree=A-kdtree.fits \
B.fits --ccol2=RA,DEC --aperture=1/3600 \
--output=A-B.fits
$ astmatch A.fits --ccol1=ra,dec --kdtree=A-kdtree.fits \
C.fits --ccol2=RA,DEC --aperture=1/3600 \
--output=A-C.fits
Irrespective of how the k-d tree is made ready (by importing or by
constructing internally), it will be used to find the nearest
A-point to each B-point. The k-d tree is parsed independently (on
different CPU threads) for each row of B.
There is just one technical issue, however: when there is no
neighbor within the acceptable distance, the k-d tree is forced to
parse all of its elements to confirm that there is no match!
Therefore if one catalog only covers a small portion (in the
coordinate space) of the other catalog, the k-d tree algorithm will
be forced to parse the full k-d tree for the majority of points!
This will dramatically decrease the running speed of Match.
Therefore, Match first divides the range of the first input in all
its dimensions into bins that have a width of the requested
aperture (similar to a histogram), and will only do the k-d tree
based search when the point in catalog B actually falls within a
bin that has at least one element in A.
Above, we described different ways of finding the $A_i$ that is
nearest to each $B_j$. But this is not the whole matching process!
Let's go ahead with a "basic" description of what happens next.  You
may be tempted to remove $A_i$ from the search of matches for $B_k$
(where $k>j$).  Therefore, as you go down B (and more matches are
found), you have to calculate fewer distances (there are fewer elements
in A that remain to be checked).  However, this will introduce an
important bias: $A_i$ may actually be closer to $B_k$ than to $B_j$!
But because $B_j$ happened to be before $B_k$ in your table, $A_i$ was
removed from the potential search domain of $B_k$.  The good match
($B_k$ with $A_i$) will therefore be lost, and replaced by a false
match between $B_j$ and $A_i$!
In a single-dimensional match, this bias depends on the sorting of
your two datasets (leading to different matches if you shuffle your
datasets).  But it will get more complex as you add dimensionality, for
example, with catalogs derived from 2D images or 3D cubes, where you
have 2 and 3 different coordinates for each point.
To address this problem, in Gnuastro (the Match program, or the
matching functions of the library) similar to above, we first parse over
the elements of B. But we will not associate the first nearest-neighbor
with a match! Instead, we will use an array (with the number of rows in
A, let's call it "B-in-A") to keep the list of all nearest element(s) in
B that match each A-point. Once all the points in B are parsed, each
A-point in B-in-A will (possibly) have a sorted list of B-points (there
may be multiple B-points that fall within the acceptable aperture of
each A-point).  In the previous example, the $i$-th element
(corresponding to $A_i$) of B-in-A will contain the following list of
B-points: $B_j$ and $B_k$.
A new array (with the number of points in B, let's call it A-in-B) is
then used to find the final match. We parse over B-in-A (that was
completed above), and from it extract the nearest B-point to each
A-point ($B_k$ for $A_i$ in the example above).  If this is the first
A-point that is found for this B-point, then we put this A-point into
A-in-B (in the example above, element $k$ is filled with $A_i$).  If
another A-point was previously found for this B-point, then the
distances of the two A-points to that B-point are compared, and the
A-point with the smaller distance is kept in A-in-B.  This will give us
the best match between the two catalogs, independent of any sorting
issues.  Both B-in-A and A-in-B will also keep the distances, so each
distance is only measured once.
In summary, here are the points to consider when selecting an algorithm
or the order of your inputs (only the speed is affected; the final
match will be the same):
• For larger datasets, the k-d tree based method (when running on all
available threads) is much faster than the classical sort-based
method.
• The k-d tree is constructed for the first input table and the
multi-threading is done on the rows of the second input table. The
construction of a larger dataset's k-d tree will take longer, but
multi-threading will work better when you have more rows. As a
result, the optimal way to place your inputs is to give the smaller
input table (with fewer rows) as the first argument (so its k-d
tree is constructed), and the larger table as the second argument
(so its rows are checked in parallel).
• If you always need to match against one catalog (that is large!),
the k-d tree construction itself can take a significant fraction of
the running time. Therefore you can save its k-d tree into a file
and simply give it to later calls, like the example given in the
description of the k-d algorithm mentioned above.
File: gnuastro.info, Node: Invoking astmatch, Prev: Matching algorithms, Up: Match
7.5.2 Invoking Match
--------------------
When given two catalogs, Match finds the rows that are nearest to each
other within an input aperture. The executable name is ‘astmatch’ with
the following general template
$ astmatch [OPTION ...] input-1 input-2
One line examples:
## 1D wavelength match (within 5 angstroms) of the two inputs.
## The wavelengths are in the 5th and 10th columns respectively.
$ astmatch --aperture=5e-10 --ccol1=5 --ccol2=10 in1.fits in2.txt
## Find the row that is closest to (RA,DEC) of (12.3456,6.7890)
## with a maximum distance of 1 arcsecond (1/3600 degrees).
## The coordinates can also be given in sexagesimal.
$ astmatch input1.txt --ccol1=ra,dec --coord=12.3456,6.7890 \
--aperture=1/3600
## Find matching rows of two catalogs with a circular aperture
## of width 2 (same unit as position columns: pixels in this case).
$ astmatch input1.txt input2.fits --aperture=2 \
--ccol1=X,Y --ccol2=IMG_X,IMG_Y
## Similar to before, but the output is created by merging various
## columns from the two inputs: columns 1, RA, DEC from the first
## input, followed by all columns starting with `MAG' and the `BRG'
## column from second input and the 10th column from first input.
$ astmatch input1.txt input2.fits --aperture=1/3600 \
--ccol1=ra,dec --ccol2=RAJ2000,DEJ2000 \
--outcols=a1,aRA,aDEC,b/^MAG/,bBRG,a10
## Assuming both inputs have the same column metadata (same name
## and numeric type), the output will contain all the rows of the
## first input, appended with the non-matching rows of the second
## input (good when you need to merge multiple catalogs that
## may have matching items, which you do not want to repeat).
$ astmatch input1.fits input2.fits --ccol1=RA,DEC --ccol2=RA,DEC \
--aperture=1/3600 --notmatched --outcols=_all
## Match the two catalogs within an elliptical aperture of 1 and 2
## arc-seconds along RA and Dec respectively.
$ astmatch --aperture=1/3600,2/3600 in1.fits in2.txt
## Match the RA and DEC columns of the first input with the RA_D
## and DEC_D columns of the second within a 0.5 arcsecond aperture.
$ astmatch --ccol1=RA,DEC --ccol2=RA_D,DEC_D --aperture=0.5/3600 \
in1.fits in2.fits
## Match in 3D (RA, Dec and Wavelength).
$ astmatch --ccol1=2,3,4 --ccol2=2,3,4 -a0.5/3600,0.5/3600,5e-10 \
in1.fits in2.txt
Match will find the rows that are nearest to each other in two
catalogs (given some coordinate columns). Alternatively, it can
construct the k-d tree of one catalog to save in a FITS file for future
matching of the same catalog with many others. To understand the inner
working of Match and its algorithms, see *note Matching algorithms::.
When matching, two catalogs are necessary for input. But for
constructing a k-d tree, only a single catalog should be given. The
input tables can be plain text tables or FITS tables, for more see *note
Tables::.  But other ways of feeding inputs are also supported:
• The _first_ catalog can also come from the standard input (for
example, a pipe that feeds the output of a previous command to
Match, see *note Standard input::);
• When you only want to match one point with another catalog, you can
use the ‘--coord’ option to avoid creating a file for the _second_
input catalog.
Match follows the same basic behavior of all Gnuastro programs as
fully described in *note Common program behavior::. If the first input
is a FITS file, the common ‘--hdu’ option (see *note Input output
options::) should be used to identify the extension. When the second
input is FITS, the extension must be specified with ‘--hdu2’.
When ‘--quiet’ is not called, Match will print its various processing
phases (including the number of matches found) on standard output (on
the command-line).  When matches are found, by default, two tables will
be output (if in FITS format, as two HDUs).  Each output table will
contain the re-arranged rows of the respective input table.  In other
words, both tables will have the same number of rows, and row N in both
corresponds to the Nth match between the two catalogs.  If no matches
are found,
the columns of the output table(s) will have zero rows (with proper
meta-data). The output format can be changed with the following
options:
• ‘--outcols’: The output will be a single table with rows chosen
from either of the two inputs in any order.
• ‘--notmatched’: The output tables will contain the rows that did
not match between the two tables. If called with ‘--outcols’, the
output will be a single table with all non-matched rows of both
tables.
• ‘--logasoutput’: The output will be a single table with the
contents of the log file, see below.
If no output file name is given with the ‘--output’ option, then
automatic output (see *note Automatic output::) will be used to
determine the output name(s).  Depending on ‘--tableformat’ (see *note
Input output options::), the output will be a (possibly
multi-extension) FITS file or (possibly two) plain text file(s).
Generally, giving a filename to ‘--output’ is recommended.
When the ‘--log’ option is called (see *note Operating mode
options::), and there was a match, Match will also create a file named
‘astmatch.fits’ (or ‘astmatch.txt’, depending on ‘--tableformat’, see
*note Input output options::) in the directory it is run in. This log
table will have three columns. The first and second columns show the
matching row/record number (counting from 1) of the first and second
input catalogs respectively. The third column is the distance between
the two matched positions. The units of the distance are the same as
the given coordinates (given the possible ellipticity, see description
of ‘--aperture’ below). When ‘--logasoutput’ is called, no log file
(with a fixed name) will be created. In this case, the output file
(possibly given by the ‘--output’ option) will have the contents of this
log file.
*‘--log’ is not thread-safe*: As described above, when ‘--logasoutput’
is not called, the log file has a fixed name for all calls to Match.
Therefore if a separate log is requested in two simultaneous calls to
Match in the same directory, Match will try to write to the same file.
This will cause problems like a corrupted log file, undefined behavior,
or a crash.  Remember that ‘--log’ is mainly intended for debugging
purposes; if you want the log file with a specific name, simply use
‘--logasoutput’ (which will also be faster, since no arranging of the
input columns is necessary).
‘-H STR’
‘--hdu2=STR’
The extension/HDU of the second input if it is a FITS file. When
it is not a FITS file, this option's value is ignored. For the
first input, the common option ‘--hdu’ must be used.
‘-k STR’
‘--kdtree=STR’
Select the algorithm and/or the way to construct or import the k-d
tree. A summary of the four acceptable strings for this option are
described here for completeness. However, for a much more detailed
discussion on Match's algorithms with examples, see *note Matching
algorithms::.
‘internal’
Construct a k-d tree for the first input internally (within
the same run of Match), and parallelize over the rows of the
second to find the nearest points. This is the default
algorithm/method used by Match (when this option is not
called).
‘build’
Only construct the k-d tree of a single input and abort (no
matching is done).  The k-d tree is written to the file name
given to ‘--output’.
‘CUSTOM-FITS-FILE’
Use the given FITS file as a k-d tree (that was previously
constructed with Match itself) of the first input, and do not
construct any k-d tree internally. The FITS file should have
two columns with an unsigned 32-bit integer data type and a
‘KDTROOT’ keyword that contains the index of the root of the
k-d tree. For more on Gnuastro's k-d tree format, see *note
K-d tree::.
‘disable’
Do not use the k-d tree algorithm for finding the nearest
neighbor, instead, use the sort-based method.
‘--kdtreehdu=STR’
The HDU of the FITS file, when a FITS file is given to the
‘--kdtree’ option that was described above.
‘--outcols=STR[,STR,[...]]’
Columns (from both inputs) to write into a single matched table
output. The value to ‘--outcols’ must be a comma-separated list of
column identifiers (number or name, see *note Selecting table
columns::).  The expected format depends on ‘--notmatched’ and is
explained below.  By default (when ‘--notmatched’ is not called),
the number of rows in the output will be equal to the number of
matches. However, when ‘--notmatched’ is called, all the rows
(from the requested columns) of the first input are placed in the
output, and the not-matched rows of the second input are inserted
afterwards (useful when you want to merge unique entries of
multiple catalogs into one).
Default (only matching rows)
The first character of each string specifies the input
catalog: ‘a’ for the first and ‘b’ for the second. The rest
of the characters of the string will be directly used to
identify the proper column(s) in the respective table. See
*note Selecting table columns:: for how columns can be
specified in Gnuastro.
For example, the output of ‘--outcols=a1,bRA,bDEC’ will have
three columns: the first column of the first input, along with
the ‘RA’ and ‘DEC’ columns of the second input.
If the string after ‘a’ or ‘b’ is ‘_all’, then all the columns
of the respective input file will be written in the output.
For example, the command below will print all the input
columns from the first catalog along with the 5th column from
the second:
$ astmatch a.fits b.fits --outcols=a_all,b5
‘_all’ can be used multiple times, possibly on both inputs.
Tip: if an input's column is called ‘_all’ (an unlikely name!)
and you do not want all the columns from that table in the
output, use its column number to avoid confusion.
Another example is given in the one-line examples above.
Compared to the default case (where two tables with all their
columns are saved separately), using this option is much
faster: it will only read and re-arrange the necessary columns
and it will write a single output table. Combined with
regular expressions in large tables, this can be a very
powerful and convenient way to merge various tables into one.
When ‘--coord’ is given, no second catalog will be read. The
second catalog will be created internally based on the values
given to ‘--coord’. So column names are not defined and you
can only request integer column numbers that are less than the
number of coordinates given to ‘--coord’. For example, if you
want to find the row matching RA of 1.2345 and Dec of 6.7890,
then you should use ‘--coord=1.2345,6.7890’. But when using
‘--outcols’, you cannot give ‘bRA’, or ‘b25’.
With ‘--notmatched’
Only the column names/numbers should be given (for example,
‘--outcols=RA,DEC,MAGNITUDE’).  It is assumed that both input
tables have the requested column(s) and that the numerical data
type of each column is the same as that of the column with the
same name in the other input.  Therefore,
if one input has a ‘MAGNITUDE’ column with a 32-bit floating
point type, but the ‘MAGNITUDE’ column of the other is 64-bit
floating point, Match will crash with an error. The metadata
of the columns will come from the first input.
As an example, let's assume ‘input1.txt’ and ‘input2.fits’
each have a different number of columns and rows. However,
they both have the ‘RA’ (64-bit floating point), ‘DEC’ (64-bit
floating point) and ‘MAGNITUDE’ (32-bit floating point)
columns. If ‘input1.txt’ has 100 rows and ‘input2.fits’ has
300 rows (such that 50 of them match within 1 arcsec of the
first), then the output of the command below will have
$100+(300-50)=350$ rows and only three columns. Other columns
in each catalog, which may be different, are ignored.
$ astmatch input1.txt --ccol1=RA,DEC \
input2.fits --ccol2=RA,DEC \
--aperture=1/3600 \
--notmatched --outcols=RA,DEC,MAGNITUDE
‘-l’
‘--logasoutput’
The output file will have the contents of the log file: indexes in
the two catalogs that match with each other along with their
distance, see description of the log file above.
When this option is called, a separate log file will not be created
and the output will not contain any of the input columns (either as
two tables containing the re-arranged columns of each input, or a
single table mixing columns), only their indices in the log format.
‘--notmatched’
Write the non-matching rows into the outputs, not the matched ones.
By default, this will produce two output tables, that will not
necessarily have the same number of rows. However, when called
with ‘--outcols’, it is possible to import non-matching rows of the
second into the first. See the description of ‘--outcols’ for
more.
‘-c INT/STR[,INT/STR]’
‘--ccol1=INT/STR[,INT/STR]’
The coordinate columns of the first input. The number of
dimensions for the match is determined by the number of
comma-separated values given to this option. The values can be the
column number (counting from 1), exact column name or a regular
expression. For more, see *note Selecting table columns::. See
the one-line examples above for some usages of this option.
‘-C INT/STR[,INT/STR]’
‘--ccol2=INT/STR[,INT/STR]’
The coordinate columns of the second input. See the example in
‘--ccol1’ for more.
‘-d FLT[,FLT]’
‘--coord=FLT[,FLT]’
Manually specify the coordinates to match against the given
catalog. With this option, Match will not look for a second input
file/table and will directly use the coordinates given to this
option. When the coordinates are RA and Dec, the comma-separated
values can either be in degrees (a single number), or sexagesimal
(‘_h_m_’ for RA, ‘_d_m_’ for Dec, or ‘_:_:_’ for both).
When this option is called, the output changes in the following
ways: 1) when ‘--outcols’ is specified, for the second input, it
can only accept integer numbers that are less than the number of
values given to this option, see description of that option for
more. 2) By default (when ‘--outcols’ is not used), only the
matching row of the first table will be output (a single file), not
two separate files (one for each table).
This option is good when you have a (large) catalog and only want
to match a single coordinate to it (for example, to find the
nearest catalog entry to your desired point). With this option,
you can write the coordinates on the command-line and thus avoid
the need to make a single-row file.
‘-a FLT[,FLT[,FLT]]’
‘--aperture=FLT[,FLT[,FLT]]’
Parameters of the aperture for matching. The values given to this
option can be fractions, for example, when the position columns are
in units of degrees, ‘1/3600’ can be used to ask for one
arc-second. The interpretation of the values depends on the
requested dimensions (determined from ‘--ccol1’ and ‘--ccol2’) and
how many values are given to this option.
When multiple objects are found within the aperture, the match is
defined as the nearest one. In a multi-dimensional dataset, when
the aperture is a general ellipse or ellipsoid (and not a circle or
sphere), the distance is calculated in the elliptical space along
the major axis.  For the definition of this distance, see $r_{el}$
in *note Defining an ellipse and ellipsoid::.
1D match
The aperture/interval can only take one value: half of the
interval around each point (maximum distance from each point).
2D match
In a 2D match, the aperture can be a circle, an ellipse
aligned in the axes or an ellipse with a rotated major axis.
To simplify the usage, you can determine the shape based on the
number of free parameters for each.
1 number
for example, ‘--aperture=2’. The aperture will be a
circle of the given radius.  The value will be in the
same units as the columns in ‘--ccol1’ and ‘--ccol2’.
2 numbers
for example, ‘--aperture=3,4e-10’. The aperture will be
an ellipse (if the two numbers are different) with the
respective value along each dimension. The numbers are
in units of the first and second axis. In the example
above, the semi-axis value along the first axis will be 3
(in units of the first coordinate) and along the second
axis will be $4\times10^{-10}$ (in units of the second
coordinate).  Such values can happen if you are comparing
catalogs of spectra, for example.  If more than one
object exists in the aperture, the nearest will be found
along the major axis as described in *note Defining an
ellipse and ellipsoid::.
3 numbers
for example, ‘--aperture=2,0.6,30’. The aperture will be
an ellipse (if the second value is not 1). The first
number is the semi-major axis, the second is the axis
ratio and the third is the position angle (in degrees).
If multiple matches are found within the ellipse, the
distance (to find the nearest) is calculated along the
major axis in the elliptical space, see *note Defining an
ellipse and ellipsoid::.
3D match
The aperture (matching volume) can be a sphere, an ellipsoid
aligned on the three axes or a general ellipsoid rotated in
any direction.  To simplify the usage, the shape can be
determined based on the number of values given to this option.
1 number
for example, ‘--aperture=3’. The matching volume will be
a sphere of the given radius. The value is in the same
units as the input coordinates.
3 numbers
for example, ‘--aperture=4,5,6e-10’.  The aperture will
be an ellipsoid aligned on the three axes, with the
respective extent along each dimension.  The numbers must
be in the same units as each axis.  This is very similar
to the two-number case of 2D inputs.  See there for more.
6 numbers
for example, ‘--aperture=4,0.5,0.6,10,20,30’. The
numbers represent the full general ellipsoid definition
(in any orientation). For the definition of a general
ellipsoid, see *note Defining an ellipse and ellipsoid::.
The first number is the semi-major axis. The second and
third are the two axis ratios. The last three are the
three Euler angles in units of degrees in the ZXZ order
as fully described in *note Defining an ellipse and
ellipsoid::.
File: gnuastro.info, Node: Data modeling, Next: High-level calculations, Prev: Data analysis, Up: Top
8 Data modeling
***************
In order to fully understand observations after the initial analysis of
the image, it is very important to compare them with existing models;
this furthers our understanding of both the models and the data.  The
tools in this chapter create model galaxies and provide 2D fittings to
help understand the detections.
* Menu:
* MakeProfiles:: Making mock galaxies and stars.
File: gnuastro.info, Node: MakeProfiles, Prev: Data modeling, Up: Data modeling
8.1 MakeProfiles
================
MakeProfiles will create mock astronomical profiles from a catalog,
either individually or together in one output image. In data analysis,
making a mock image can act like a calibration tool, through which you
can test how successfully your detection technique is able to detect a
known set of objects. There are commonly two aspects to detecting: the
detection of the fainter parts of bright objects (which in the case of
galaxies fade into the noise very slowly) or the complete detection of
an over-all faint object. Making mock galaxies is the most accurate
(and idealistic) way these two aspects of a detection algorithm can be
tested.  You also need mock profiles when fitting known functional
profiles to observations.
MakeProfiles was initially built for extragalactic studies, so
currently the only astronomical objects it can produce are stars and
galaxies. We welcome the simulation of any other astronomical object.
The general outline of the steps that MakeProfiles takes are the
following:
1. Build the full profile out to its truncation radius in a possibly
over-sampled array.
2. Multiply all the elements by a fixed constant so the profile's
total magnitude equals the desired total magnitude.
3. If ‘--individual’ is called, save the array for each profile to a
FITS file.
4. If ‘--nomerged’ is not called, add the overlapping pixels of all
the created profiles to the output image and abort.
Using input values, MakeProfiles adds the World Coordinate System
(WCS) headers of the FITS standard to all its outputs (except PSF
images!). For a simple test on a set of mock galaxies in one image,
there is no need for the third step or the WCS information.
However in complicated simulations like weak lensing simulations,
where each galaxy undergoes various types of individual transformations
based on their position, those transformations can be applied to the
different individual images with other programs. After all the
transformations are applied, using the WCS information in each
individual profile image, they can be merged into one output image for
convolution and adding noise.
* Menu:
* Modeling basics:: Astronomical modeling basics.
* If convolving afterwards:: Considerations for convolving later.
* Profile magnitude:: Definition of total profile magnitude.
* Invoking astmkprof:: Inputs and Options for MakeProfiles.
File: gnuastro.info, Node: Modeling basics, Next: If convolving afterwards, Prev: MakeProfiles, Up: MakeProfiles
8.1.1 Modeling basics
---------------------
In the subsections below, first a review of some very basic information
and concepts behind modeling a real astronomical image is given. You
can skip this subsection if you are already sufficiently familiar with
these concepts.
* Menu:
* Defining an ellipse and ellipsoid:: Definition of these important shapes.
* PSF:: Radial profiles for the PSF.
* Stars:: Making mock star profiles.
* Galaxies:: Radial profiles for galaxies.
* Sampling from a function:: Sample a function on a pixelated canvas.
* Oversampling:: Oversampling the model.
File: gnuastro.info, Node: Defining an ellipse and ellipsoid, Next: PSF, Prev: Modeling basics, Up: Modeling basics
8.1.1.1 Defining an ellipse and ellipsoid
.........................................
The PSF, see *note PSF::, and galaxy radial profiles are generally
defined on an ellipse. Therefore, in this section we will start
defining an ellipse on a pixelated 2D surface. Labeling the major axis
of an ellipse $a$, and its minor axis with $b$, the _axis ratio_ is
defined as: $q\equiv b/a$. The major axis of an ellipse can be aligned
in any direction, therefore the angle of the major axis with respect to
the horizontal axis of the image is defined to be the _position angle_
of the ellipse and in this book, we show it with $\theta$.
Our aim is to put a radial profile of any functional form $f(r)$ over
an ellipse. Hence we need to associate a radius/distance to every point
in space. Let's define the radial distance $r_{el}$ as the distance on
the major axis to the center of an ellipse which is located at $i_c$ and
$j_c$ (in other words $r_{el}\equiv{a}$). We want to find $r_{el}$ of a
point located at $(i,j)$ (in the image coordinate system) from the
center of the ellipse with axis ratio $q$ and position angle $\theta$.
First the coordinate system is rotated(1) by $\theta$ to get the new
rotated coordinates of that point $(i_r,j_r)$:
$$i_r(i,j)=+(i_c-i)\cos\theta+(j_c-j)\sin\theta$$
$$j_r(i,j)=-(i_c-i)\sin\theta+(j_c-j)\cos\theta$$
Recall that an ellipse is defined by $(i_r/a)^2+(j_r/b)^2=1$ and that we
defined $r_{el}\equiv{a}$. Hence, multiplying all elements of the
ellipse definition with $r_{el}^2$, we get the elliptical distance of
the point located at $(i,j)$: $r_{el}=\sqrt{i_r^2+(j_r/q)^2}$.  To
place the radial profiles explained below over an ellipse, $f(r_{el})$
is calculated based on the desired functional radial profile.
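As a quick illustrative check (not part of the formal definition
above), consider an ellipse with $q=0.5$ and $\theta=0$, and a point
that is one pixel away from the center along the second axis, so
$(i,j)=(i_c,j_c-1)$.  From the equations above:
$$i_r=0,\quad j_r=1,\quad r_{el}=\sqrt{0^2+(1/0.5)^2}=2$$
In other words, a point on the minor axis is assigned the major-axis
length of the ellipse that passes through it, as expected from
$r_{el}\equiv{a}$.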
An ellipse in 3D, or an ellipsoid
(https://en.wikipedia.org/wiki/Ellipsoid), can be defined following
similar principles as before. Labeling the major (largest) axis length
as $a$, the second and third (in a right-handed coordinate system) axis
lengths can be labeled as $b$ and $c$. Hence we have two axis ratios:
$q_1\equiv{b/a}$ and $q_2\equiv{c/a}$. The orientation of the ellipsoid
can be defined from the orientation of its major axis. There are many
ways to define 3D orientation and order matters. So to be clear, here
we use the ZXZ (or $Z_1X_2Z_3$) proper Euler angles
(https://en.wikipedia.org/wiki/Euler_angles) to define the 3D
orientation. In short, when a point is rotated in this order, we first
rotate it around the Z axis (third axis) by $\alpha$, then about the
(rotated) X axis by $\beta$ and finally about the (rotated) Z axis by
$\gamma$.
Following the discussion in *note Merging multiple warpings::, we can
define the full rotation with the following matrix multiplication.
However, here we are rotating the coordinates, not the point.
Therefore, both the rotation angles and rotation order are reversed. We
are also not using homogeneous coordinates (see *note Linear warping
basics::) since we are not concerned with translation in this context:
$$\left[\matrix{i_r\cr j_r\cr k_r}\right] =
\left[\matrix{\cos\gamma&\sin\gamma&0\cr -\sin\gamma&\cos\gamma&0\cr 0&0&1}\right]
\left[\matrix{1&0&0\cr 0&\cos\beta&\sin\beta\cr 0&-\sin\beta&\cos\beta }\right]
\left[\matrix{\cos\alpha&\sin\alpha&0\cr -\sin\alpha&\cos\alpha&0\cr 0&0&1}\right]
\left[\matrix{i_c-i\cr j_c-j\cr k_c-k}\right] $$
Recall that an ellipsoid can be characterized with
$(i_r/a)^2+(j_r/b)^2+(k_r/c)^2=1$, so similar to before
($r_{el}\equiv{a}$), we can find the ellipsoidal radius at pixel
$(i,j,k)$ as: $r_{el}=\sqrt{i_r^2+(j_r/q_1)^2+(k_r/q_2)^2}$.
MakeProfiles builds the profile starting from the nearest element
(pixel in an image) in the dataset to the profile center. The profile
value is calculated for that central pixel using Monte Carlo
integration, see *note Sampling from a function::. The next pixel is
the next nearest neighbor to the central pixel as defined by $r_{el}$.
This process goes on until the profile is fully built up to the
truncation radius.  This is done fairly efficiently using a
breadth-first parsing strategy(2) which is implemented through an
ordered linked list.
Using this approach, we build the profile by expanding its
circumference, so no unnecessary extra pixels are checked (the
calculation of $r_{el}$ above is not cheap in CPU terms).  Another
consequence of this strategy is that extending MakeProfiles to three
dimensions becomes very simple: only the neighbors of each pixel have to
be changed. Everything else after that (when the pixel index and its
radial profile have entered the linked list) is the same, no matter the
number of dimensions we are dealing with.
---------- Footnotes ----------
(1) Do not confuse the signs of $\sin$ with the rotation matrix
defined in *note Linear warping basics::. In that equation, the point
is rotated, here the coordinates are rotated and the point is fixed.
(2) <http://en.wikipedia.org/wiki/Breadth-first_search>
File: gnuastro.info, Node: PSF, Next: Stars, Prev: Defining an ellipse and ellipsoid, Up: Modeling basics
8.1.1.2 Point spread function
.............................
Assume we have a 'point' source, or a source that is far smaller than
the maximum resolution (a pixel). When we take an image of it, it will
'spread' over an area. To quantify that spread, we can define a
'function'. This is how the "point spread function" or the PSF of an
image is defined.
This 'spread' can have various causes, for example, in ground-based
astronomy, due to the atmosphere. In practice we can never surpass the
'spread' due to the diffraction of the telescope aperture (even in
Space!). Various other effects can also be quantified through a PSF.
For example, the simple fact that we are sampling in a discrete space,
namely the pixels, also produces a very small 'spread' in the image.
Convolution is the mathematical process by which we can apply a
'spread' to an image, or in other words blur the image, see *note
Convolution process::. The sum of pixels of an image should remain
unchanged after convolution. Therefore, it is important that the sum of
all the pixels of the PSF be unity. The PSF image also has to have an
odd number of pixels on its sides so one pixel can be defined as the
center.
In MakeProfiles, the PSF can be set by the two methods explained
below:
Parametric functions
A known mathematical function is used to make the PSF. In this
case, only the parameters to define the functions are necessary and
MakeProfiles will make a PSF based on the given parameters for each
function. In both cases, the center of the profile has to be
exactly in the middle of the central pixel of the PSF (which is
automatically done by MakeProfiles). When talking about the PSF,
usually, the full width at half maximum or FWHM is used as a scale
of the width of the PSF.
‘Gaussian’
In the older papers, and to a lesser extent even today, some
researchers use the 2D Gaussian function to approximate the
PSF of ground based images. In its most general form, a
Gaussian function can be written as:
$$f(r)=a \exp \left( -(r-\mu)^2 \over 2\sigma^2 \right)+d$$
Since the center of the profile is pre-defined, $\mu$ and $d$
are constrained. $a$ can also be found because the function
has to be normalized. So the only important parameter for
MakeProfiles is the $\sigma$. In the Gaussian function we
have this relation between the FWHM and $\sigma$:
$$\rm{FWHM}_g=2\sqrt{2\ln{2}}\sigma \approx 2.35482\sigma$$
‘Moffat’
The Gaussian profile is much sharper than the images taken
from stars on photographic plates or CCDs. Therefore in 1969,
Moffat proposed this functional form for the image of stars:
$$f(r)=a \left[ 1+\left( r\over \alpha \right)^2 \right]^{-\beta}$$
Again, $a$ is constrained by the normalization, therefore two
parameters define the shape of the Moffat function: $\alpha$
and $\beta$. The radial parameter is $\alpha$ which is
related to the FWHM by
$$\rm{FWHM}_m=2\alpha\sqrt{2^{1/\beta}-1}$$
Comparing with the PSF predicted from atmospheric turbulence
theory with a Moffat function, Trujillo et al.(1) claim that
$\beta$ should be 4.765. They also show how the Moffat PSF
contains the Gaussian PSF as a limiting case when
$\beta\to\infty$.
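As an illustrative calculation (the numbers are hypothetical, not
from the reference above): to model a star with a FWHM of 3
pixels, a Gaussian would need $\sigma=3/2.35482\approx1.274$
pixels, while a Moffat function with $\beta=4.765$ would need
$$\alpha={3\over 2\sqrt{2^{1/4.765}-1}}\approx3.79{\rm\ pixels.}$$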
An input FITS image
An input image file can also be specified to be used as a PSF. If
the sum of its pixels is not equal to 1, the pixels will be
scaled so that the sum becomes 1.
Gnuastro has tools to extract the non-parametric (extended) PSF of
any image as a FITS file (assuming there are a sufficient number of
stars in it), see *note Building the extended PSF::. This method
is not perfect (will have noise if you do not have many stars), but
it is the actual PSF of the data that is not forced into any
parametric form.
While the Gaussian is only dependent on the FWHM, the Moffat function
is also dependent on $\beta$. Comparing these two functions with a
fixed FWHM gives the following results:
• Within the FWHM, the functions do not have significant differences.
• For a fixed FWHM, as $\beta$ increases, the Moffat function becomes
sharper.
• The Gaussian function is much sharper than the Moffat functions,
even when $\beta$ is large.
---------- Footnotes ----------
(1) Trujillo, I., J. A. L. Aguerri, J. Cepa, and C. M. Gutierrez
(2001). "The effects of seeing on Sérsic profiles - II. The Moffat
PSF". In: MNRAS 328, pp. 977--985.
File: gnuastro.info, Node: Stars, Next: Galaxies, Prev: PSF, Up: Modeling basics
8.1.1.3 Stars
.............
In MakeProfiles, stars are generally considered to be point sources.
This is usually the case for extragalactic studies, where nearby stars
are also in the field.  Since a star is only a point source, we assume
that it only fills one pixel prior to convolution.  In fact, exactly for
this reason, in astronomical images the light profiles of stars are one
of the best methods to understand the shape of the PSF and a very large
fraction of scientific research is performed by assuming the shapes of
stars to be the PSF of the image.
File: gnuastro.info, Node: Galaxies, Next: Sampling from a function, Prev: Stars, Up: Modeling basics
8.1.1.4 Galaxies
................
Today, most practitioners agree that the flux of galaxies can be modeled
with one or a few generalized de Vaucouleurs (or Sérsic) profiles.
$$I(r) = I_e \exp \left ( -b_n \left[ \left( r \over r_e \right)^{1/n} -1 \right] \right )$$
Gérard de Vaucouleurs (1918-1995) was the first to show, in 1948, that
this function resembles galaxy light profiles, with the only difference
that he held $n$ fixed to a value of 4. Twenty years later in 1968, J.
L. Sérsic showed that $n$ can have a variety of values and does not
necessarily need to be 4. This profile depends on the effective radius
($r_e$) which is defined as the radius which contains half of the
profile's 2-dimensional integral to infinity (see *note Profile
magnitude::). $I_e$ is the flux at the effective radius. The Sérsic
index $n$ is used to define the concentration of the profile within
$r_e$ and $b_n$ is a constant dependent on $n$. MacArthur et al.(1)
show that for $n>0.35$, $b_n$ can be accurately approximated using this
equation:
$$b_n=2n - {1\over 3} + {4\over 405n} + {46\over 25515n^2} + {131\over 1148175n^3}-{2194697\over 30690717750n^4}$$
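As a quick sanity check of this approximation (an illustrative
calculation, not from the reference above), for the de Vaucouleurs
case of $n=4$ it gives
$$b_4\approx 8-{1\over3}+{4\over1620}+{46\over408240}+{131\over73483200}-{2194697\over7856823744000}\approx7.669,$$
which agrees with the commonly quoted value of $b_4\approx7.669$.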
---------- Footnotes ----------
(1) MacArthur, L. A., S. Courteau, and J. A. Holtzman (2003).
"Structure of Disk-dominated Galaxies. I. Bulge/Disk Parameters,
Simulations, and Secular Evolution". In: ApJ 582, pp. 689--722.
File: gnuastro.info, Node: Sampling from a function, Next: Oversampling, Prev: Galaxies, Up: Modeling basics
8.1.1.5 Sampling from a function
................................
A pixel is the ultimate level of accuracy to gather data; we cannot get
any more accurate in one image.  This is known as sampling in signal
processing.  However, the mathematical profiles which describe our
models have infinite accuracy.  Over a large fraction of the area of
astrophysically interesting profiles (for example, galaxies or PSFs),
the variation of the profile over the area of one pixel is not
significant.  In such cases, the profile value at the elliptical radius
($r_{el}$) of the center of the pixel can be assigned to the whole
pixel (see *note Defining an ellipse and ellipsoid::).
As you approach their center, some galaxies become very sharp (their
value significantly changes over one pixel's area).  This sharpness
increases with smaller effective radius and larger Sérsic values,
rendering the central value extremely inaccurate.  The first method
that comes to mind for solving this problem is integration: the
functional form of the profile can be integrated over the pixel area in
a 2D integration process.  Unfortunately, numerical integration
techniques also have their limitations and can become extremely
inaccurate when such sharp profiles are needed.
The most accurate method of sampling a continuous profile on a
discrete space is by choosing a large number of random points within the
boundaries of the pixel and taking their average value (or Monte Carlo
integration). This is also, generally speaking, what happens in
practice with the photons on the pixel. The number of random points can
be set with ‘--numrandom’.
Unfortunately, repeating this Monte Carlo process for every pixel would
be extremely time- and CPU-consuming.  In order to not lose too much
accuracy, MakeProfiles builds the profile using both methods explained
above.  The building of the profile begins from its central pixel and
continues (radially) outwards.  Monte Carlo integration is first
applied (which yields $F_r$), then the central pixel value ($F_c$) is
calculated on the same pixel.  If the fractional difference
($|F_r-F_c|/F_r$) is lower than a given tolerance level (specified with
‘--tolerance’), MakeProfiles will stop using Monte Carlo integration
and only use the central pixel value.
The ordering of the pixels in this inside-out construction is based
on $r=\sqrt{(i_c-i)^2+(j_c-j)^2}$, not $r_{el}$, see *note Defining an
ellipse and ellipsoid::.  When the axis ratio is large (near one) this
is fine.  But when it is small and the object is highly elliptical, it
might seem more reasonable to follow $r_{el}$, not $r$.  The problem is
that the gradient is stronger in pixels with a smaller $r$ (and larger
$r_{el}$) than in those with a smaller $r_{el}$; in other words, the
gradient is strongest along the minor axis.  So if the next pixel were
chosen based on $r_{el}$, the tolerance level would be reached sooner
and lots of pixels with large fractional differences would be missed.
Monte Carlo integration uses random points. Thus, every time you run
it, by default, you will get a different distribution of points to
sample within the pixel. In the case of large profiles, this will
result in a slight difference in the pixels which use Monte Carlo
integration each time MakeProfiles is run. To have a deterministic
result, you have to fix the properties of the random number generator
that is used to build the random distribution. This can be done by
setting the ‘GSL_RNG_TYPE’ and ‘GSL_RNG_SEED’ environment variables and
calling MakeProfiles with the ‘--envseed’ option. To learn more about
the process of generating random numbers, see *note Generating random
numbers::.
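For example (the generator name, seed value and catalog name below are
hypothetical placeholders):
$ export GSL_RNG_TYPE=ranlxs1
$ export GSL_RNG_SEED=1600359089
$ astmkprof --envseed catalog.txt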
The seed values are fixed for every profile: with ‘--envseed’, all
the profiles have the same seed and without it, each will get a
different seed using the system clock (which is accurate to within one
microsecond). The same seed will be used to generate a random number
for all the sub-pixel positions of all the profiles. So in the former
case, the sub-pixel points checked for all the pixels undergoing Monte
Carlo integration in all profiles will be identical. In other words,
the sub-pixel points in the first (closest to the center) pixel of all
the profiles will be identical to each other. All the second pixels
studied for all the profiles will also receive an identical (different
from the first pixel) set of sub-pixel points, and so on. As long as
the number of random points used is large enough or the profiles are
not identical, this should not cause any systematic bias.
File: gnuastro.info, Node: Oversampling, Prev: Sampling from a function, Up: Modeling basics
8.1.1.6 Oversampling
....................
The steps explained in *note Sampling from a function:: do give an
accurate representation of a profile prior to convolution. However, in
an actual observation, the image is first convolved with or blurred by
the atmospheric and instrument PSF in a continuous space and then it is
sampled on the discrete pixels of the camera.
In order to simulate this process more accurately, the unconvolved
image and the PSF are created on a finer pixel grid. In other words,
the output image is a certain odd-integer multiple of the desired size;
we call this 'oversampling'. The user can specify this multiple as a
command-line option. The reason this has to be an odd number is that
the PSF has to be centered on the center of its image, and an image
with an even number of pixels on each side does not have a central
pixel.
The image can then be convolved with the PSF (which should also be
oversampled on the same scale). Finally, the image can be sub-sampled
to get to the desired pixel size of the output image. After this, mock
noise can be added, as explained in the next section. This is because
unlike the PSF, the noise occurs in each output pixel, not on a
continuous space like all the prior steps.
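For example, the sketch below (the catalog name is hypothetical; see
*note MakeProfiles output dataset:: for the options) builds the
profiles on a grid that is 3 times finer than the requested 200x200
size, so the output will be 600x600 pixels:
$ astmkprof --oversample=3 --mergedsize=200,200 catalog.txt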
File: gnuastro.info, Node: If convolving afterwards, Next: Profile magnitude, Prev: Modeling basics, Up: MakeProfiles
8.1.2 If convolving afterwards
------------------------------
In case you want to convolve the image later with a given point spread
function, make sure to use a larger image size. After convolution, the
profiles become larger and a profile that is normally completely outside
of the image might fall within it.
On one axis, if you want your final (convolved) image to be $m$
pixels wide and your PSF is $2n+1$ pixels wide, then when calling
MakeProfiles, set the axis size to $m+2n$, not $m$. You also have to
shift all the pixel positions of the profile centers on that axis by
$n$ pixels in the positive direction.
After convolution, you can crop the outer $n$ pixels with the section
crop box specification of Crop: ‘--section=n+1:*-n,n+1:*-n’ (according
to the FITS standard, counting is from 1 so we use ‘n+1’) assuming your
PSF is a square, see *note Crop section syntax::. This will also remove
all discrete Fourier transform artifacts (blurred sides) from the final
image. To facilitate this shift, MakeProfiles has the options
‘--xshift’, ‘--yshift’ and ‘--prepforconv’, see *note Invoking
astmkprof::.
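For example (a purely numeric sketch with a hypothetical file name):
if the final convolved image should be 100 pixels wide on an axis and
the PSF is 11 pixels wide ($n=5$), set that axis size to 110, shift the
profile centers by 5 pixels, and after convolution crop the border
with:
$ astcrop --section=6:*-5,6:*-5 convolved.fits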
File: gnuastro.info, Node: Profile magnitude, Next: Invoking astmkprof, Prev: If convolving afterwards, Up: MakeProfiles
8.1.3 Profile magnitude
-----------------------
To find the profile's total magnitude (see *note Brightness flux
magnitude::), it is customary to use the 2D integration of the flux to
infinity. However, in MakeProfiles we do not follow this idealistic
approach and apply a more realistic method to find the total magnitude:
the sum of all the pixels belonging to a profile within its predefined
truncation radius. Note that if the truncation radius is not large
enough, this can be significantly different from the total integrated
light to infinity.
An integration to infinity is not a realistic condition because no
galaxy extends indefinitely (important for high Sérsic index profiles).
Pixelation can also cause a significant difference between the actual
total pixel sum value of the profile and that of integration to
infinity, especially in small and high Sérsic index profiles. To be
safe, you can specify a large enough truncation radius for such compact
high Sérsic index profiles.
If oversampling is used, then the pixel value is calculated using the
over-sampled image (see *note Oversampling::), which is much more
accurate. The profile is first built in an array that completely
bounds it, with a normalization constant of unity (see *note
Galaxies::). Taking $V$ to be the desired total sum of pixel values
(derived from the catalog's magnitude and the zero point) and $S$ to be
the sum of the pixels in the created profile, every pixel is then
multiplied by $V/S$ so the sum is exactly $V$.
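For example, with a zero point magnitude of 25 and a catalog magnitude
of 20, the desired sum is $V=10^{(25-20)/2.5}=100$ in units of the
input's pixel values (see *note Brightness flux magnitude::), so every
pixel of the unity-normalized profile is multiplied by 100.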
If the ‘--individual’ option is called, this same array is written to
a FITS file. If not, only the overlapping pixels of this array and the
output image are kept and added to the output array.
File: gnuastro.info, Node: Invoking astmkprof, Prev: Profile magnitude, Up: MakeProfiles
8.1.4 Invoking MakeProfiles
---------------------------
MakeProfiles will make any number of profiles specified in a catalog
either individually or in one image. The executable name is ‘astmkprof’
with the following general template
$ astmkprof [OPTION ...] [Catalog]
One line examples:
## Make an image with profiles in catalog.txt (with default size):
$ astmkprof catalog.txt
## Make the profiles in catalog.txt over image.fits:
$ astmkprof --background=image.fits catalog.txt
## Make a Moffat PSF with FWHM 3pix, beta=2.8, truncation=5
$ astmkprof --kernel=moffat,3,2.8,5 --oversample=1
## Make profiles in catalog, using RA and Dec in the given column:
$ astmkprof --ccol=RA_CENTER --ccol=DEC_CENTER --mode=wcs catalog.txt
## Make a 1500x1500 merged image (500x500, oversampled by 3), along
## with an individual image for all the profiles in the catalog:
$ astmkprof --individual --oversample 3 --mergedsize=500,500 cat.txt
The parameters of the mock profiles can either be given through a
catalog (which stores the parameters of many mock profiles, see *note
MakeProfiles catalog::), or the ‘--kernel’ option (see *note
MakeProfiles output dataset::). The catalog can be in the FITS ASCII,
FITS binary format, or plain text formats (see *note Tables::). A plain
text catalog can also be provided using the Standard input (see *note
Standard input::). The columns related to each parameter can be
determined either by number, or by match/search criteria using the
column names, units, or comments, with the options ending in ‘col’ (see
below).
Without any file given to the ‘--background’ option, MakeProfiles
will make a zero-valued image and build the profiles on that (its size
and main WCS parameters can also be defined through the options
described in *note MakeProfiles output dataset::). Besides the
main/merged image containing all the profiles in the catalog, it is also
possible to build individual images for each profile (only enclosing one
full profile to its truncation radius) with the ‘--individual’ option.
If an image is given to the ‘--background’ option, the pixels of that
image are used as the background: the flux value of each profile pixel
will be added to the corresponding pixel of the background image. You
can disable this with the ‘--clearcanvas’ option (which will initialize
the background to zero-valued pixels and build profiles over that).
With the ‘--background’ option, the values to all options relating to
the "canvas" (output size and WCS) will be ignored if specified:
‘--oversample’, ‘--mergedsize’, ‘--prepforconv’, ‘--crpix’, ‘--crval’,
‘--cdelt’, ‘--pc’, ‘--cunit’ and ‘--ctype’.
The sections below discuss the options specific to MakeProfiles based
on context: the input catalog settings (which can have many rows for
different profiles) are discussed in *note MakeProfiles catalog::. In
*note MakeProfiles profile settings::, we discuss how you can set
general profile settings (that are the same for all the profiles in the
catalog). Finally, *note MakeProfiles output dataset:: and *note
MakeProfiles log file:: discuss the outputs of MakeProfiles and how you
can configure them. Besides these, MakeProfiles also supports all the
common Gnuastro program options that are discussed in *note Common
options::, so please flip through them as well for a more comfortable
usage.
When building 3D profiles, there are more degrees of freedom. Hence,
more columns are necessary and all the values related to dimensions (for
example, size of dataset in each dimension and the WCS properties) must
also have 3 values. To allow having an independent set of default
values for creating 3D profiles, MakeProfiles also installs a
‘astmkprof-3d.conf’ configuration file (see *note Configuration
files::). You can use this for default 3D profile values. For example,
if you installed Gnuastro with the prefix ‘/usr/local’ (the default
location, see *note Installation directory::), you can benefit from this
configuration file by running MakeProfiles like the example below. As
with all configuration files, if you want to customize a given option,
call it before the configuration file.
$ astmkprof --config=/usr/local/etc/astmkprof-3d.conf catalog.txt
To further simplify the process, you can define a shell alias in any
startup file (for example, ‘~/.bashrc’, see *note Installation
directory::). Assuming that you installed Gnuastro in ‘/usr/local’, you
can add this line to the startup file:
alias astmkprof-3d="astmkprof --config=/usr/local/etc/astmkprof-3d.conf"
Using this alias, you can call MakeProfiles with the name ‘astmkprof-3d’
(instead of ‘astmkprof’). It will automatically load the 3D specific
configuration file first, and then parse any other arguments, options or
configuration files. You can change the default values in this 3D
configuration file by calling them on the command-line as you do with
‘astmkprof’(1).
Please see *note Sufi simulates a detection:: for a complete
tutorial explaining how one could use MakeProfiles in conjunction with
other Gnuastro programs to make a complete simulated image of a mock
galaxy.
* Menu:
* MakeProfiles catalog:: Required catalog properties.
* MakeProfiles profile settings:: Configuration parameters for all profiles.
* MakeProfiles output dataset:: The canvas/dataset to build profiles over.
* MakeProfiles log file:: A description of the optional log file.
---------- Footnotes ----------
(1) Recall that for single-invocation options, the last command-line
invocation takes precedence over all previous invocations (including
those in the 3D configuration file). See the description of ‘--config’
in *note Operating mode options::.
File: gnuastro.info, Node: MakeProfiles catalog, Next: MakeProfiles profile settings, Prev: Invoking astmkprof, Up: Invoking astmkprof
8.1.4.1 MakeProfiles catalog
............................
The catalog containing information about each profile can be in the FITS
ASCII, FITS binary, or plain text formats (see *note Tables::). The
latter can also be provided using standard input (see *note Standard
input::). Its columns can be ordered in any desired manner. You can
specify which columns belong to which parameters using the set of
options discussed below. For example, through the ‘--rcol’ and ‘--tcol’
options, you can specify the column that contains the radial parameter
for each profile and its truncation respectively. See *note Selecting
table columns:: for a thorough discussion on the values to these
options.
The value for the profile center in the catalog (the ‘--ccol’ option)
can be a floating point number so the profile center can be on any
sub-pixel position. Note that pixel positions in the FITS standard
start from 1 and an integer is the pixel center. So a 2D image actually
starts from the position (0.5, 0.5), which is the bottom-left corner of
the first pixel. When a ‘--background’ image with WCS information is
provided, or you specify the WCS parameters with the respective
options(1), you may also use RA and Dec to identify the center of each
profile (see the ‘--mode’ option below).
In MakeProfiles, profile centers do not have to be in (overlap with)
the final image. Even if only one pixel of the profile within the
truncation radius overlaps with the final image size, the profile is
built and included in the final image. Profiles that are completely out
of the image will not be created (unless you explicitly ask for it with
the ‘--individual’ option). You can use the output log file (created
with ‘--log’) to see which profiles were within the image (see *note
Common options::).
If PSF profiles (Moffat or Gaussian, see *note PSF::) are in the
catalog and the profiles are to be built in one image (when
‘--individual’ is not used), it is assumed they are the PSF(s) you want
to convolve your created image with. So by default, they will not be
built in the output image but as separate files. The sum of pixels of
these separate files will also be set to unity (1) so you are ready to
convolve, see *note Convolution process::. As a summary, the position
and magnitude of a PSF profile will be ignored. This behavior can be
disabled with the ‘--psfinimg’ option. If you want to create all the
profiles separately (with ‘--individual’) and you want the sum of the
PSF profile pixels to be unity, you have to set their magnitudes in the
catalog to the zero point magnitude and be sure that the central
positions of the profiles do not have any fractional part (the PSF
center has to be in the center of the pixel).
The list of options directly related to the input catalog columns is
shown below.
‘--ccol=STR/INT’
Center coordinate column for each dimension. This option must be
called two times to define the center coordinates in an image. For
example, ‘--ccol=RA’ and ‘--ccol=DEC’ (along with ‘--mode=wcs’)
will inform MakeProfiles to look into the catalog columns named
‘RA’ and ‘DEC’ for the Right Ascension and Declination of the
profile centers.
‘--fcol=INT/STR’
The functional form of the profile, with one of the values below
depending on the desired profile. The column can contain either
the numeric codes (for example, '‘1’') or string characters (for
example, '‘sersic’'). The numeric codes are easier to use in
scripts which generate catalogs with hundreds or thousands of
profiles (a minimal example catalog is given after the list below).
The string format can be easier when the catalog is to be
written/checked by hand/eye before running MakeProfiles. It is
much more readable and provides a level of documentation. All
Gnuastro's recognized table formats (see *note Recognized table
formats::) accept string type columns. To have string columns in a
plain text table/catalog, see *note Gnuastro text table format::.
• Sérsic profile with '‘sersic’' or '‘1’'.
• Moffat profile with '‘moffat’' or '‘2’'.
• Gaussian profile with '‘gaussian’' or '‘3’'.
• Point source with '‘point’' or '‘4’'.
• Flat profile with '‘flat’' or '‘5’'.
• Circumference profile with '‘circum’' or '‘6’'. A fixed value
will be used for all pixels less than or equal to the
truncation radius ($r_t$) and greater than $r_t-w$ ($w$ is the
value to the ‘--circumwidth’).
• Radial distance profile with '‘distance’' or '‘7’'. At the
lowest level, each pixel only has an elliptical radial
distance given the profile's shape and orientation (see *note
Defining an ellipse and ellipsoid::). When this profile is
chosen, the pixel's elliptical radial distance from the
profile center is written as its value. For this profile, the
value in the magnitude column (‘--mcol’) will be ignored.
You can use this for checks or as a first approximation to
define your own higher-level radial function. In the latter
case, just note that the central values are going to be
incorrect (see *note Sampling from a function::).
• Custom radial profile with '‘custom-prof’' or '‘8’'. The
values to use for each radial interval should be in the table
given to ‘--customtable’. By default, once the profile is
built with the given values, it will be scaled to have a total
magnitude that you have requested in the magnitude column of
the profile (in ‘--mcol’). If you want the raw values in the
2D profile (to ignore the magnitude column), use
‘--mcolnocustprof’. For more, see the description of
‘--customtable’ in *note MakeProfiles profile settings::.
• Azimuthal angle profile with '‘azimuth’' or '‘9’'. Every
pixel within the truncation radius will be given its azimuthal
angle (in degrees, from 0 to 360) from the major axis. In
combination with the radial distance profile, you can now
create complex features in polar coordinates, such as tidal
tails or tidal shocks (using the Arithmetic program to mix the
radius and azimuthal angle through a function to create your
desired features).
• Custom image with '‘custom-img’' or '‘10’'. The image(s) to
use should be given to the ‘--customimg’ option (which can be
called multiple times for multiple images). To identify which
one of the images (given to ‘--customimg’) should be used, you
should specify their counter in the "radius" column below.
For more, see the description of ‘custom-img’ in *note
MakeProfiles profile settings::.
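For example, here is a minimal plain-text catalog with a single
Sérsic profile (all values are hypothetical), using the numeric
code and the same column order as the other examples in this book
(ID, two center coordinates, function, radius, Sérsic index,
position angle, axis ratio, magnitude and truncation):
$ cat catalog.txt
1  100.0  100.0  1  10  2.5  45  0.7  22  5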
‘--rcol=STR/INT’
The radius parameter of the profiles. Effective radius ($r_e$) if
Sérsic, FWHM if Moffat or Gaussian.
For a custom image profile, this option is not interpreted as a
radius, but as a counter (identifying which one of the images given
to ‘--customimg’ should be used for each row).
‘--ncol=STR/INT’
The Sérsic index ($n$) or Moffat $\beta$.
‘--pcol=STR/INT’
The position angle (in degrees) of the profiles relative to the
first FITS axis (horizontal when viewed in SAO DS9). When building
a 3D profile, this is the first Euler angle: first rotation of the
ellipsoid major axis from the first FITS axis (rotating about the
third axis). See *note Defining an ellipse and ellipsoid::.
‘--p2col=STR/INT’
Second Euler angle (in degrees) when building a 3D ellipsoid. This
is the second rotation of the ellipsoid major axis (following
‘--pcol’) about the (rotated) X axis. See *note Defining an
ellipse and ellipsoid::. This column is ignored when building a 2D
profile.
‘--p3col=STR/INT’
Third Euler angle (in degrees) when building a 3D ellipsoid. This
is the third rotation of the ellipsoid major axis (following
‘--pcol’ and ‘--p2col’) about the (rotated) Z axis. See *note
Defining an ellipse and ellipsoid::. This column is ignored when
building a 2D profile.
‘--qcol=STR/INT’
The axis ratio of the profiles (minor axis divided by the major
axis in a 2D ellipse). When building a 3D ellipsoid, this is the
ratio of the major axis to the semi-axis length of the second
dimension (in a right-handed coordinate system). See $q1$ in *note
Defining an ellipse and ellipsoid::.
‘--q2col=STR/INT’
The ratio of the ellipsoid major axis to the third semi-axis length
(in a right-handed coordinate system) of a 3D ellipsoid. See $q2$
in *note Defining an ellipse and ellipsoid::. This column is
ignored when building a 2D profile.
‘--mcol=STR/INT’
The total pixelated magnitude of the profile within the truncation
radius, see *note Profile magnitude::.
‘--tcol=STR/INT’
The truncation radius of this profile. By default it is in units
of the radial parameter of the profile (the value in the ‘--rcol’
of the catalog). If ‘--tunitinp’ is given, this value is
interpreted in units of pixels (prior to oversampling) irrespective
of the profile.
---------- Footnotes ----------
(1) The options to set the WCS are the following: ‘--crpix’,
‘--crval’, ‘--cdelt’, ‘--pc’, ‘--cunit’ and ‘--ctype’. Just
recall that these options are only used if ‘--background’ is not given:
if the image you give to ‘--background’ does not have WCS, these options
will not be used and you cannot use WCS-mode coordinates like RA or Dec.
File: gnuastro.info, Node: MakeProfiles profile settings, Next: MakeProfiles output dataset, Prev: MakeProfiles catalog, Up: Invoking astmkprof
8.1.4.2 MakeProfiles profile settings
.....................................
The profile parameters that differ between each created profile are
specified through the columns in the input catalog and described in
*note MakeProfiles catalog::. Besides those, there are general settings
for some profiles that do not differ between one profile and another;
they are a property of the general process. For example, the number of
random points to use in the Monte Carlo integration is fixed for all
the profiles. The options described in this section are for
configuring such properties.
‘--mode=STR’
Interpret the center position columns (‘--ccol’ in *note
MakeProfiles catalog::) in image or WCS coordinates. This option
thus accepts only two values: ‘img’ and ‘wcs’. It is mandatory
when a catalog is being used as input.
‘-r INT’
‘--numrandom=INT’
The number of random points used in the central regions of the
profile, see *note Sampling from a function::.
‘-e’
‘--envseed’
Use the value to the ‘GSL_RNG_SEED’ environment variable to
generate the random Monte Carlo sampling distribution, see *note
Sampling from a function:: and *note Generating random numbers::.
‘-t FLT’
‘--tolerance=FLT’
The tolerance to switch from Monte Carlo integration to the central
pixel value, see *note Sampling from a function::.
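For example, to sample very sharp profiles more robustly, you can
increase the number of random points and tighten the tolerance (the
values below are hypothetical):
$ astmkprof --numrandom=10000 --tolerance=0.001 catalog.txt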
‘-p’
‘--tunitinp’
The truncation column of the catalog is in units of pixels. By
default, the truncation column is considered to be in units of the
radial parameters of the profile (‘--rcol’). Read it as
't-unit-in-p' for 'truncation unit in pixels'.
‘-f’
‘--mforflatpix’
When making fixed value profiles ("flat", "circumference" or
"point" profiles, see '‘--fcol’'), do not use the value in the
column specified by '‘--mcol’' as the magnitude. Instead use it as
the exact value that all the pixels of these profiles should have.
This option is irrelevant for other types of profiles. This option
is very useful for creating masks, or labeled regions in an image.
Any integer or floating point value can be used in this column with
this option, including ‘NaN’ (or '‘nan’', or '‘NAN’', case is
irrelevant), and infinities (‘inf’, ‘-inf’, or ‘+inf’).
For example, with this option if you set the value in the magnitude
column (‘--mcol’) to ‘NaN’, you can create an elliptical or
circular mask over an image (which can be given as the argument),
see *note Blank pixels::. Another useful application of this
option is to create labeled elliptical or circular apertures in an
image. To do this, set the value in the magnitude column to the
label you want for this profile. This labeled image can then be
used in combination with NoiseChisel's output (see *note
NoiseChisel output::) to do aperture photometry with MakeCatalog
(see *note MakeCatalog::).
Alternatively, if you want to mark regions of the image (for
example, with an elliptical circumference) and you do not want to
use NaN values (as explained above) for some technical reason, you
can get the minimum or maximum value in the image (1) using
Arithmetic (see *note Arithmetic::), then use that value in the
magnitude column along with this option for all the profiles.
Please note that when using MakeProfiles on an already existing
image, you have to set '‘--oversample=1’'. Otherwise all the
profiles will be scaled up based on the oversampling scale in your
configuration files (see *note Configuration files::) unless you
have accounted for oversampling in your catalog.
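For example, the sketch below (file name, positions and sizes are
hypothetical) places a NaN-valued flat ellipse (code '‘5’') over an
existing image to mask that region; the columns follow the same
order as the other examples in this book:
$ echo "1 150 150 5 20 0 45 0.8 nan 1" \
      | astmkprof --background=image.fits --mforflatpix \
                  --mode=img --oversample=1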
‘--mcolissum’
The value given in the "magnitude" column (specified by ‘--mcol’,
see *note MakeProfiles catalog::) must be interpreted as total sum
of pixel values, not magnitude (which is measured from the total
sum and zero point, see *note Brightness flux magnitude::). When
this option is called, the zero point magnitude (value to the
‘--zeropoint’ option) is ignored and the given value must have the
same units as the input dataset's pixels.
Recall that the total profile magnitude that is specified in the
‘--mcol’ column of the input catalog is not an integration to
infinity, but the actual sum of pixels in the profile (until the
desired truncation radius). See *note Profile magnitude:: for more
on this point.
‘--mcolnocustprof’
Do not touch (re-scale) the custom profile that should be inserted
in ‘custom-prof’ profile (see the description of ‘--fcol’ in *note
MakeProfiles catalog:: or the description of ‘--customtable’
below). By default, MakeProfiles will scale (multiply) the custom
image's pixels to have the desired magnitude (or sum of pixels if
‘--mcolissum’ is called) in that row.
‘--mcolnocustimg’
Do not touch (re-scale) the custom image that should be inserted in
‘custom-img’ profile (see the description of ‘--fcol’ in *note
MakeProfiles catalog::). By default, MakeProfiles will scale
(multiply) the custom image's pixels to have the desired magnitude
(or sum of pixels if ‘--mcolissum’ is called) in that row.
‘--magatpeak’
The magnitude column in the catalog (see *note MakeProfiles
catalog::) will be used to set the value only for the profile's
peak (maximum) pixel, not the full profile. Note that this is the
flux of the profile's peak (maximum) pixel in the final output of
MakeProfiles. So beware of the oversampling, see *note
Oversampling::.
This option can be useful if you want to check a mock profile's
total magnitude at various truncation radii. Without this option,
no matter what the truncation radius is, the total magnitude will
be the same as that given in the catalog. But with this option,
the total magnitude will become brighter as you increase the
truncation radius.
In sharper profiles, the peak flux can sometimes be measured more
accurately than the overall object's sum or magnitude. In such
cases, with this option, the final profile will be built such
that its peak has the given magnitude, not the total profile.
*CAUTION:* If you want to use this option for comparing with
observations, please note that MakeProfiles does not do
convolution. Unless you have deconvolved your data, your images
are convolved with the instrument and atmospheric PSF, see *note
PSF::. Particularly in sharper profiles, the flux in the peak
pixel is strongly decreased after convolution. Also note that in
such cases, besides deconvolution, you will have to set
‘--oversample=1’ otherwise after resampling your profile with Warp
(see *note Warp::), the peak flux will be different.
‘--customtable FITS/TXT’
The filename of the table to use in the custom radial profiles
(see the description of ‘--fcol’ in *note MakeProfiles catalog::).
This can be a plain-text or a FITS table (see *note Recognized
table formats::); if it is a FITS table, you can use
‘--customtablehdu’ to specify which HDU should be used (described
below).
A custom radial profile can have any value you want for each
radial interval (including NaN/blank values). Each interval is
defined by its minimum (inclusive) and maximum (exclusive) radius;
when a pixel center falls within a radius interval, the value
specified for that interval will be used. If a pixel is not in any
of the given intervals, a value of 0.0 will be used for that pixel.
The table should have 3 columns as shown below. If the intervals
are contiguous (the maximum value of each interval is equal to the
minimum value of the next) and the intervals all have the same
size (difference between minimum and maximum values), the creation
of these profiles will be fast. However, if the intervals are not
sorted and contiguous, MakeProfiles will parse the intervals from
the top of the table and use the first interval that contains the
pixel center (this may slow it down).
Column 1:
The interval's minimum radius.
Column 2:
The interval's maximum radius.
Column 3:
The value to be used for pixels within the given interval
(including NaN/blank).
Gnuastro's column arithmetic in the Table program has the
‘sorted-to-interval’ operator that will generate the first two
columns from a single column (your radial profile). See the
description of that operator in *note Column arithmetic:: and the
example below.
By default, once a 2D image is constructed for the radial profile,
it will be scaled such that its total magnitude corresponds to the
value in the magnitude column (‘--mcol’) of the main input catalog.
If you want to disable the scaling and use the raw values in your
custom profile (in other words: you want to ignore the magnitude
column) you need to call ‘--mcolnocustprof’ (see above).
In the example below, we'll start with a certain radial profile,
and use this option to build its 2D representation in an image
(recall that you can build radial profiles with *note Generate
radial profile::). But first, we will need to use the
‘sorted-to-interval’ to build the necessary input format (see *note
Column arithmetic::).
$ cat radial.txt
# Column 1: RADIUS [pix ,f32,] Radial distance
# Column 2: MEAN [input-units,f32,] Mean of values.
0.0 1.00000
1.0 0.50184
1.4 0.37121
2.0 0.26414
2.2 0.23427
2.8 0.17868
3.0 0.16627
3.1 0.15567
3.6 0.13132
4.0 0.11404
## Convert the radius in each row to an interval
$ asttable radial.txt --output=interval.fits \
-c'arith RADIUS sorted-to-interval',MEAN
## Inspect the table containing intervals
$ asttable interval.fits -ffixed
-0.500000 0.500000 1.000000
0.500000 1.200000 0.501840
1.200000 1.700000 0.371210
1.700000 2.100000 0.264140
2.100000 2.500000 0.234270
2.500000 2.900000 0.178680
2.900000 3.050000 0.166270
3.050000 3.350000 0.155670
3.350000 3.800000 0.131320
3.800000 4.200000 0.114040
## Build the 2D image of the profile from the interval.
$ echo "1 7 7 8 10 2.5 0 1 1 2" \
| astmkprof --mergedsize=13,13 --oversample=1 \
--customtable=interval.fits \
--output=image.fits
## View the created FITS image.
$ astscript-fits-view image.fits --ds9scale=minmax
Recall that if you want your image pixels to have the same values
as the ‘MEAN’ column in your profile, you should run MakeProfiles
with ‘--mcolnocustprof’.
In case you want to build the profile using *note Generate radial
profile::, be sure to use the ‘--oversample’ option of
‘astscript-radial-profile’. The higher the oversampling, the
better your result will be. For example, you can run the following
script to see the effect (also see bug 65106
(https://savannah.gnu.org/bugs/?65106)). But don't take the
oversampling too high: both the radial profile script and
MakeProfiles will become slower and the precision of your results
will decrease.
#!/bin/bash
# Function to avoid repeating code: first generate a radial profile
# with a certain oversampling, then build a 2D profile from it.
# The first argument is the oversampling, the second is the suffix.
gen_rad_make_2dprf () {
# Generate the radial profile
radraw=$bdir/radial-profile-$2.fits
astscript-radial-profile $prof -o$radraw \
--oversample=$1 \
--zeroisnotblank
# Generate the custom table format
custraw=$bdir/customtable-$2.fits
asttable $radraw -c'arith RADIUS sorted-to-interval',MEAN \
-o$custraw
# Build the 2D profile.
prof2draw=$bdir/prof2d-$2.fits
echo "1 $xc $yc 8 30 0 0 1 0 1" \
| astmkprof --customtable=$custraw \
--mergedsize=$xw,$yw \
--output=$prof2draw \
--mcolnocustprof \
--oversample=1 \
--clearcanvas \
--mode=img
}
# Directory to hold built files
bdir=build
if ! [ -d $bdir ]; then mkdir $bdir; fi
# Build a Gaussian profile in the center of an image to start with.
prof=$bdir/prof.fits
astmkprof --kernel=gaussian,2,5 -o$prof
# Find the center pixel of the image
xw=$(astfits $prof --keyvalue=NAXIS1 --quiet)
yw=$(astfits $prof --keyvalue=NAXIS2 --quiet)
xc=$(echo $xw | awk '{print int($1/2)+1}')
yc=$(echo $yw | awk '{print int($1/2)+1}')
# Generate two 2D radial profiles, one with an oversampling of 1
# and another with an oversampling of 5.
gen_rad_make_2dprf 1 "raw"
gen_rad_make_2dprf 5 "oversample"
# View the two images beside each other:
astscript-fits-view $bdir/prof2d-raw.fits \
$bdir/prof2d-oversample.fits
‘--customtablehdu INT/STR’
The HDU/extension in the FITS file given to ‘--customtable’.
‘--customimg=STR[,STR]’
A custom FITS image that should be used for the ‘custom-img’
profiles (see the description of ‘--fcol’ in *note MakeProfiles
catalog::). Multiple files can be given to this option (separated
by a comma), and this option can be called multiple times itself
(useful when many custom image profiles should be added). If the
HDUs of the images are different, you can use ‘--customimghdu’
(described below).
Through the "radius" column, MakeProfiles will know which one of
the images given to this option should be used in each row. For
example, let's assume your input catalog (‘cat.fits’) has the
following contents (output of first command below), and you call
MakeProfiles like the second command below to insert four profiles
into the background ‘back.fits’ image.
The first profile below is Sérsic (with an ‘--fcol’, or 4-th
column, code of ‘1’). So MakeProfiles builds the pixels of the
first profile, and all column values are meaningful. However, the
second, third and fourth inserted objects are custom images (with
an ‘--fcol’ code of ‘10’). For the custom image profiles, you see
that the radius column has values of ‘1’ or ‘2’. This tells
MakeProfiles to use the first image given to ‘--customimg’ (or
‘gal-1.fits’) for the second and fourth inserted objects. The
second image given to ‘--customimg’ (or ‘gal-2.fits’) will be
used for the third inserted object. Finally, all three custom
image profiles have different magnitudes, and the values in
‘--ncol’, ‘--pcol’, ‘--qcol’ and ‘--tcol’ are ignored.
$ asttable cat.fits
1 53.15506 -27.785165 1 20 1 20 0.6 25 5
2 53.15602 -27.777887 10 1 0 0 0 22 0
3 53.16440 -27.775876 10 2 0 0 0 24 0
4 53.16849 -27.787406 10 1 0 0 0 23 0
$ astmkprof cat.fits --mode=wcs --zeropoint=25.68 \
--background=back.fits --output=out.fits \
--customimg=gal-1.fits --customimg=gal-2.fits
‘--customimghdu=INT/STR’
The HDU(s) of the images given to ‘--customimg’. If this option
is only called once, but ‘--customimg’ is called many times,
MakeProfiles will assume that all images given to ‘--customimg’
have the same HDU. Otherwise (if the number of HDUs is equal to
the number of images), each image will use its corresponding HDU.
‘-X INT,INT’
‘--shift=INT,INT’
Shift all the profiles and enlarge the image along each dimension.
To better understand this option, please see $n$ in *note If
convolving afterwards::. This is useful when you want to convolve
the image afterwards. If you are using an external PSF, be sure to
oversample it to the same scale used for creating the mock images.
If a background image is specified, any possible value to this
option is ignored.
‘-c’
‘--prepforconv’
Shift all the profiles and enlarge the image based on half the
width of the first Moffat or Gaussian profile in the catalog,
considering any possible oversampling (see *note If convolving
afterwards::). ‘--prepforconv’ is only checked and possibly
activated if ‘--xshift’ and ‘--yshift’ are both zero (after reading
the command-line and configuration files). If a background image
is specified, any possible value to this option is ignored.
‘-z FLT’
‘--zeropoint=FLT’
The zero point magnitude of the input. For more on the zero point
magnitude, see *note Brightness flux magnitude::.
‘-w FLT’
‘--circumwidth=FLT’
The width of the circumference if the profile is to be an
elliptical circumference or annulus. See the explanations for this
type of profile in ‘--fcol’.
‘-R’
‘--replace’
Do not add the pixels of each profile over the background or other
profiles; instead, replace the values.
By default, when two profiles overlap, the final pixel value is the
sum of all the profiles that overlap on that pixel. This is the
expected situation when dealing with physical object profiles like
galaxies or stars/PSF. However, when MakeProfiles is used to build
integer labeled images (for example, in *note Aperture
photometry::), this is not the expected situation: the sum of two
labels will be a new label. With this option, the pixels are not
added but the largest (maximum) value over that pixel is used.
Because the maximum operator is independent of the order of values,
the output is also thread-safe.
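For example, the sketch below (all values hypothetical) combines
this option with ‘--mforflatpix’ to build two overlapping circular
apertures labeled 1 and 2; overlapping pixels get the larger label
instead of the sum (which would be a meaningless new label of 3):
$ printf "1 100 100 5 15 0 0 1 1 1\n2 110 100 5 15 0 0 1 2 1\n" \
      | astmkprof --mergedsize=200,200 --oversample=1 \
                  --mforflatpix --replace --mode=img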
---------- Footnotes ----------
(1) The minimum will give a better result, because the maximum can be
too high compared to most pixels in the image, making it harder to
display.
File: gnuastro.info, Node: MakeProfiles output dataset, Next: MakeProfiles log file, Prev: MakeProfiles profile settings, Up: Invoking astmkprof
8.1.4.3 MakeProfiles output dataset
...................................
MakeProfiles takes an input catalog and uses the basic properties that
are defined there to build a dataset, for example, a 2D image
containing the profiles in the catalog. In *note MakeProfiles
catalog:: and *note MakeProfiles profile settings::, the catalog and
profile settings were discussed. The options of this section allow you
to configure the output dataset (or the canvas that will host the built
profiles).
‘-k FITS’
‘--background=FITS’
A background image FITS file to build the profiles on. The
extension that contains the image should be specified with the
‘--backhdu’ option, see below. When a background image is
specified, it will be used to derive all the information about the
output image. Hence, the following options will be ignored:
‘--mergedsize’, ‘--oversample’, ‘--crpix’, ‘--crval’ (generally,
all other WCS related parameters) and the output's data type (see
‘--type’ in *note Input output options::).
The background image will act like a canvas to build the profiles
on: profile pixel values will be summed with the background image
pixel values. With the ‘--replace’ option you can disable this
behavior and replace the profile pixels with the background pixels.
If you want to use all the image information above, except for the
pixel values (you want to have a blank canvas to build the profiles
on, based on an input image), you can call ‘--clearcanvas’, to set
all the input image's pixels to zero before starting to build the
profiles over it (this is done in memory after reading the input,
so nothing will happen to your input file).
‘-B STR/INT’
‘--backhdu=STR/INT’
The header data unit (HDU) of the file given to ‘--background’.
‘-C’
‘--clearcanvas’
When an input image is specified (with the ‘--background’ option),
set all its pixels to 0.0 immediately after reading it into memory.
Effectively, this will allow you to use all its properties
(described under the ‘--background’ option), without having to
worry about the pixel values.
‘--clearcanvas’ can come in handy in many situations, for example,
if you want to create a labeled image (segmentation map) for
creating a catalog (see *note MakeCatalog::). In other cases, you
might have modeled the objects in an image and want to create them
on the same frame, but without the original pixel values.
‘-E STR/INT,FLT[,FLT,[...]]’
‘--kernel=STR/INT,FLT[,FLT,[...]]’
Only build one kernel profile with the parameters given as the
values to this option. The different values must be separated by a
comma (<,>). The first value identifies the radial function of the
profile, either through a string or through a number (see
description of ‘--fcol’ in *note MakeProfiles catalog::). Each
radial profile needs a different total number of parameters: Sérsic
and Moffat functions need 3 parameters: radial, Sérsic index or
Moffat $\beta$, and truncation radius. The Gaussian function needs
two parameters: radial and truncation radius. The point function
does not need any parameters and flat and circumference profiles
just need one parameter (truncation radius).
The PSF or kernel is a unique (and highly constrained) type of
profile: the sum of its pixels must be one, its center must be the
center of the central pixel (in an image with an odd number of
pixels on each side), and commonly it is circular, so its axis
ratio and position angle are one and zero respectively. Kernels
are commonly necessary for various data analysis and data
manipulation steps (for example, see *note Convolve:: and *note
NoiseChisel::). Because of this, it is inconvenient to define a
catalog with one row and many zero valued columns (for all the
non-necessary parameters). Hence, with this option, it is possible
to create a kernel with MakeProfiles without the need to create a
catalog. Here are some examples:
‘--kernel=moffat,3,2.8,5’
A Moffat kernel with FWHM of 3 pixels, $\beta=2.8$ which is
truncated at 5 times the FWHM.
‘--kernel=gaussian,2,3’
A circular Gaussian kernel with FWHM of 2 pixels and truncated
at 3 times the FWHM.
This option may also be used to create a 3D kernel. To do that,
two small modifications are necessary: add a ‘-3d’ (or ‘-3D’) to
the profile name (for example, ‘moffat-3d’) and add a number
(axis-ratio along the third dimension) to the end of the parameters
for all profiles except ‘point’. The main reason behind providing
an axis ratio in the third dimension is that in 3D astronomical
datasets, commonly the third dimension does not have the same
nature (units/sampling) as the first and second.
For example, in IFU (optical) or Radio data cubes, the first and
second dimensions are commonly spatial/angular positions (like RA
and Dec) but the third dimension is wavelength or frequency (in
units of Angstroms or Hertz). Because of this different nature
(which also affects the processing), it may be necessary for the
kernel to have a different extent in that direction.
If the 3rd dimension axis ratio is equal to $1.0$, then the kernel
will be a spheroid. If it is smaller than $1.0$, the kernel will
be button-shaped: extended less in the third dimension. However,
when it is larger than $1.0$, the kernel will be bullet-shaped:
extended more in the third dimension. In the latter case, the
radial parameter will correspond to the length along the 3rd
dimension. For example, let's have a look at the two examples
above but in 3D:
‘--kernel=moffat-3d,3,2.8,5,0.5’
An ellipsoid Moffat kernel with FWHM of 3 pixels, $\beta=2.8$
which is truncated at 5 times the FWHM. The ellipsoid is
circular in the first two dimensions, but in the third
dimension its extent is half the first two.
‘--kernel=gaussian-3d,2,3,1’
A spherical Gaussian kernel with FWHM of 2 pixels and
truncated at 3 times the FWHM.
Of course, if a specific kernel is needed that does not fit the
constraints imposed by this option, you can always use a catalog to
define any arbitrary kernel. Just call the ‘--individual’ and
‘--nomerged’ options to make sure that it is built as a separate
file (individually) and no "merged" image of the input profiles is
created.
‘-x INT,INT’
‘--mergedsize=INT,INT’
The number of pixels along each axis of the output, in FITS order.
This is before over-sampling. For example, if you call
MakeProfiles with ‘--mergedsize=100,150 --oversample=5’ (assuming
no shift for later convolution), then the final image will be 500
pixels along the first axis and 750 along the second. Fractions
are acceptable as values for each dimension; however, they must
reduce to an integer, so ‘--mergedsize=150/3,300/3’ is acceptable
but ‘--mergedsize=150/4,300/4’ is not.
When viewing a FITS image in DS9, the first FITS dimension is in
the horizontal direction and the second is vertical. As an
example, the image created with the example above will have 500
pixels horizontally and 750 pixels vertically.
If a background image is specified, this option is ignored.
‘-s INT’
‘--oversample=INT’
The scale by which to over-sample the profiles and final image.
If it is not an odd number, it will be incremented by one (see
*note Oversampling::). Note that this ‘--oversample’ will remain
active even if an input image is specified. If your input catalog
is based on the background image, be sure to set ‘--oversample=1’.
‘--psfinimg’
Build the possibly existing PSF profiles (Moffat or Gaussian) in
the catalog into the final image. By default they are built
separately so you can convolve your images with them, thus their
magnitude and positions are ignored. With this option, they will
be built in the final image like every other galaxy profile. To
have a final PSF in your image, make a point profile where you want
the PSF and after convolution it will be the PSF.
‘-i’
‘--individual’
If this option is called, each profile is created in a separate
FITS file within the same directory as the output, with the row
number of the profile (starting from zero) in its name. The file
for each row's profile will be in the same directory as the final
combined image of all the profiles and will have the final image's
name as a suffix. So for example, if the final combined image is
named ‘./out/fromcatalog.fits’, then the first profile that will be
created with this option will be named ‘./out/0_fromcatalog.fits’.
Since each image only has one full profile out to the truncation
radius, the profile is centered, and so only the sub-pixel
position of the profile center is important for the outputs of
this option.
The output will have an odd number of pixels. If there is no
oversampling, the central pixel will contain the profile center.
If the value to ‘--oversample’ is larger than unity, then the
profile center is on any of the central ‘--oversample’'d pixels
depending on the fractional value of the profile center.
If the fractional value is larger than half, it is on the bottom
half of the central region. This is due to the FITS definition of
a real number position: The center of a pixel has fractional value
$0.00$ so each pixel contains these fractions: .5 - .75 - .00
(pixel center) - .25 - .5.
‘-m’
‘--nomerged’
Do not make a merged image. By default after making the profiles,
they are added to a final image with side lengths specified by
‘--mergedsize’ if they overlap with it.
The options below can be used to define the world coordinate system
(WCS) properties of the MakeProfiles outputs. The option names are
deliberately chosen to be the same as the FITS standard WCS keywords.
See Section 8 of Pence et al. [2010]
(https://doi.org/10.1051/0004-6361/201015362) for a short introduction
to WCS in the FITS standard(1).
If you look into the headers of a FITS image with WCS for example,
you will see all these names but in uppercase and with numbers to
represent the dimensions, for example, ‘CRPIX1’ and ‘PC2_1’. You can
see the FITS headers with Gnuastro's *note Fits:: program using a
command like this: ‘$ astfits -p image.fits’.
If the values given to any of these options do not correspond to the
number of dimensions in the output dataset, then no WCS information
will be added. Also recall that if you use the ‘--background’ option,
all of these options are ignored, such that if the image given to
‘--background’ does not have any WCS, the output of MakeProfiles will
also not have any WCS, even if these options are given(2). A sketch
combining these options is given after their descriptions below.
‘--crpix=FLT,FLT’
The pixel coordinates of the WCS reference point. Fractions are
acceptable for the values of this option.
‘--crval=FLT,FLT’
The WCS coordinates of the Reference point. Fractions are
acceptable for the values of this option. The comma-separated
values can either be in degrees (a single number), or sexagesimal
(‘_h_m_’ for RA, ‘_d_m_’ for Dec, or ‘_:_:_’ for both). In any
case, the final value that will be written in the ‘CRVAL’ keyword
will be a floating point number in degrees (according to the FITS
standard).
‘--cdelt=FLT,FLT’
The resolution (size of one data-unit or pixel in WCS units) of the
non-oversampled dataset. Fractions are acceptable for the values
of this option.
‘--pc=FLT,FLT,FLT,FLT’
The PC matrix of the WCS rotation, see the FITS standard (link
above) to better understand the PC matrix.
‘--cunit=STR,STR’
The units of each WCS axis, for example, ‘deg’. Note that these
values are part of the FITS standard (link above). MakeProfiles
will not complain if you use non-standard values, but later usage
of them might cause trouble.
‘--ctype=STR,STR’
The type of each WCS axis, for example, ‘RA---TAN’ and ‘DEC--TAN’.
Note that these values are part of the FITS standard (link above).
MakeProfiles will not complain if you use non-standard values, but
later usage of them might cause trouble.
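As a sketch combining these options (all values are hypothetical),
the command below builds the catalog's profiles on a blank 1000x1000
canvas with a complete WCS:
$ astmkprof --mergedsize=1000,1000 --oversample=1 \
            --crpix=500.5,500.5 --crval=53.16,-27.78 \
            --cdelt=0.0002,0.0002 --pc=-1,0,0,1 \
            --cunit=deg,deg --ctype=RA---TAN,DEC--TAN \
            catalog.txt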
---------- Footnotes ----------
(1) The world coordinate standard in FITS is a very beautiful and
powerful concept to link/associate datasets with the outside world
(other datasets). The description in the FITS standard (link above)
only touches the tip of the iceberg. To learn more, please see Greisen
and Calabretta [2002] (https://doi.org/10.1051/0004-6361:20021326),
Calabretta and Greisen [2002]
(https://doi.org/10.1051/0004-6361:20021327), Greisen et al. [2006]
(https://doi.org/10.1051/0004-6361:20053818), and Calabretta et al.
(http://www.atnf.csiro.au/people/mcalabre/WCS/dcs_20040422.pdf)
(2) If you want to add profiles _and_ WCS over the background image
(to produce your output), you need more than one command: 1. You should
use ‘--mergedsize’ in MakeProfiles to manually set the output number of
pixels equal to your desired background image (so the background is
zero). In this mode, you can use these WCS-related options to define
the WCS. 2. Then use Arithmetic to add the pixels of your mock image to
the background (see *note Arithmetic::).