gnuastro (0.22)
This is gnuastro.info, produced by makeinfo version 7.1 from
gnuastro.texi.
This book documents version 0.22 of the GNU Astronomy Utilities
(Gnuastro). Gnuastro provides various programs and libraries for
astronomical data manipulation and analysis.
Copyright © 2015-2024 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation; with no Invariant Sections, no Front-Cover Texts, and
no Back-Cover Texts. A copy of the license is included in the
section entitled "GNU Free Documentation License".
INFO-DIR-SECTION Astronomy
START-INFO-DIR-ENTRY
* Gnuastro: (gnuastro). GNU Astronomy Utilities.
* libgnuastro: (gnuastro)Gnuastro library. Full Gnuastro library doc.
* help-gnuastro: (gnuastro)help-gnuastro mailing list. Getting help.
* bug-gnuastro: (gnuastro)Report a bug. How to report bugs
* Arithmetic: (gnuastro)Arithmetic. Arithmetic operations on pixels.
* astarithmetic: (gnuastro)Invoking astarithmetic. Options to Arithmetic.
* BuildProgram: (gnuastro)BuildProgram. Compile and run programs using Gnuastro's library.
* astbuildprog: (gnuastro)Invoking astbuildprog. Options to BuildProgram.
* ConvertType: (gnuastro)ConvertType. Convert different file types.
* astconvertt: (gnuastro)Invoking astconvertt. Options to ConvertType.
* Convolve: (gnuastro)Convolve. Convolve an input file with kernel.
* astconvolve: (gnuastro)Invoking astconvolve. Options to Convolve.
* CosmicCalculator: (gnuastro)CosmicCalculator. For cosmological params.
* astcosmiccal: (gnuastro)Invoking astcosmiccal. Options to CosmicCalculator.
* Crop: (gnuastro)Crop. Crop region(s) from image(s).
* astcrop: (gnuastro)Invoking astcrop. Options to Crop.
* Fits: (gnuastro)Fits. View and manipulate FITS extensions and keywords.
* astfits: (gnuastro)Invoking astfits. Options to Fits.
* MakeCatalog: (gnuastro)MakeCatalog. Make a catalog from labeled image.
* astmkcatalog: (gnuastro)Invoking astmkcatalog. Options to MakeCatalog.
* MakeProfiles: (gnuastro)MakeProfiles. Make mock profiles.
* astmkprof: (gnuastro)Invoking astmkprof. Options to MakeProfiles.
* Match: (gnuastro)Match. Match two separate catalogs.
* astmatch: (gnuastro)Invoking astmatch. Options to Match.
* NoiseChisel: (gnuastro)NoiseChisel. Detect signal in noise.
* astnoisechisel: (gnuastro)Invoking astnoisechisel. Options to NoiseChisel.
* Segment: (gnuastro)Segment. Segment detections based on signal structure.
* astsegment: (gnuastro)Invoking astsegment. Options to Segment.
* Query: (gnuastro)Query. Access remote databases for downloading data.
* astquery: (gnuastro)Invoking astquery. Options to Query.
* Statistics: (gnuastro)Statistics. Get image Statistics.
* aststatistics: (gnuastro)Invoking aststatistics. Options to Statistics.
* Table: (gnuastro)Table. Read and write FITS binary or ASCII tables.
* asttable: (gnuastro)Invoking asttable. Options to Table.
* Warp: (gnuastro)Warp. Warp a dataset to a new grid.
* astwarp: (gnuastro)Invoking astwarp. Options to Warp.
* astscript: (gnuastro)Installed scripts. Gnuastro's installed scripts.
* astscript-ds9-region: (gnuastro)Invoking astscript-ds9-region. Options to this script
* astscript-fits-view: (gnuastro)Invoking astscript-fits-view. Options to this script
* astscript-pointing-simulate: (gnuastro)Invoking astscript-pointing-simulate. Options to this script
* astscript-psf-scale-factor: (gnuastro)Invoking astscript-psf-scale-factor. Options to this script
* astscript-psf-select-stars: (gnuastro)Invoking astscript-psf-select-stars. Options to this script
* astscript-psf-stamp: (gnuastro)Invoking astscript-psf-stamp. Options to this script
* astscript-psf-subtract: (gnuastro)Invoking astscript-psf-subtract. Options to this script
* astscript-psf-unite: (gnuastro)Invoking astscript-psf-unite. Options to this script
* astscript-radial-profile: (gnuastro)Invoking astscript-radial-profile. Options to this script
* astscript-sort-by-night: (gnuastro)Invoking astscript-sort-by-night. Options to this script
* astscript-zeropoint: (gnuastro)Invoking astscript-zeropoint. Options to this script
END-INFO-DIR-ENTRY
File: gnuastro.info, Node: Query, Prev: Table, Up: Data containers
5.4 Query
=========
There are many astronomical databases available for downloading
astronomical data. Most follow the International Virtual Observatory
Alliance (IVOA, <https://ivoa.net>) standards (and in particular the
Table Access Protocol, or TAP(1)). With TAP, it is possible to submit
your queries via a command-line downloader (for example, ‘curl’) to only
get specific tables, targets (rows in a table) or measurements (columns
in a table): you do not have to download the full table (which can be
very large in some cases)! These customizations are done through the
Astronomical Data Query Language (ADQL(2)).
Therefore, if you are sufficiently familiar with TAP and ADQL, you
can easily custom-download any part of an online dataset. However, you
also need to keep a record of the URLs of each database and in many
cases, the commands will become long and hard/buggy to type on the
command-line. On the other hand, most astronomers do not know TAP or
ADQL at all, and are forced to go to the database's web page which is
slow (it needs to download so many images, and has too much annoying
information), requires manual interaction (further making it slow and
buggy), and cannot be automated.
Gnuastro's Query program is designed to be the middle-man in this
process: it provides a simple high-level interface to let you specify
your constraints on what you want to download. It then internally
constructs the command to download the data based on your inputs and
runs it to download your desired data. Query also prints the full
command before it executes it (if not called with ‘--quiet’). Also, if
you ask for a FITS output table, the full command is written into its
0-th extension along with other input parameters to query (all Gnuastro
programs generally keep their input configuration parameters as FITS
keywords in the zero-th output). You can see it with Gnuastro's Fits
program, like below:
$ astfits query-output.fits -h0
With the full command used to download the dataset, you only need a
minimal knowledge of ADQL to do lower-level customizations on your
downloaded dataset. You can simply copy that command and change the
parts of the query string you want: ADQL is very powerful! For example,
you can ask the server to do mathematical operations on the columns and
apply selections after those operations, or combine/match multiple
datasets. We will try to add high-level interfaces for such
capabilities, but generally, do not limit yourself to the high-level
operations (that cannot cover everything!).
* Menu:
* Available databases:: List of available databases to Query.
* Invoking astquery:: Inputs, outputs and configuration of Query.
---------- Footnotes ----------
(1) <https://ivoa.net/documents/TAP>
(2) <https://ivoa.net/documents/ADQL>
File: gnuastro.info, Node: Available databases, Next: Invoking astquery, Prev: Query, Up: Query
5.4.1 Available databases
-------------------------
The current list of databases supported by Query are listed at the end
of this section. To get the list of available datasets within each
database, you can use the ‘--information’ option. for example, with the
command below you can get a list of the roughly 100 datasets that are
available within the ESA Gaia server with their description:
$ astquery gaia --information
However, other databases like VizieR host many more datasets (tens of
thousands!). Therefore it is very inconvenient to get the _full_
information every time you want to find your dataset of interest (the
full metadata file VizieR is more than 20Mb). In such cases, you can
limit the downloaded and displayed information with the ‘--limitinfo’
option. For example, with the first command below, you can get all
datasets relating to the MUSE (an instrument on the Very Large
Telescope), and those that include Roland Bacon (Principle Investigator
of MUSE) as an author (‘Bacon, R.’). Recall that ‘-i’ is the short
format of ‘--information’.
$ astquery vizier -i --limitinfo=MUSE
$ astquery vizier -i --limitinfo="Bacon R."
Once you find the recognized name of your desired dataset, you can
see the column information of that dataset with adding the dataset name.
For example, with the command below you can see the column metadata in
the ‘J/A+A/608/A2/udf10’ dataset (one of the datasets in the search
above) using this command:
$ astquery vizier --dataset=J/A+A/608/A2/udf10 -i
For very popular datasets of a database, Query provides an
easier-to-remember short name that you can feed to ‘--dataset’. This
short name will map to the officially recognized name of the dataset on
the server. In this mode, Query will also set positional columns
accordingly. For example, most VizieR datasets have an ‘RAJ2000’ column
(the RA and the epoch of 2000) so it is the default RA column name for
coordinate search (using ‘--center’ or ‘--overlapwith’). However, some
datasets do not have this column (for example, SDSS DR12). So when you
use the short name and Query knows about this dataset, it will
internally set the coordinate columns that SDSS DR12 has: ‘RA_ICRS’ and
‘DEC_ICRS’. Recall that you can always change the coordinate columns
with ‘--ccol’.
For example, in the VizieR and Gaia databases, the recognized name
for data release 3 data is respectively ‘I/355/gaiadr3’ and
‘gaiadr3.gaia_source’. These technical names are hard to remember.
Therefore Query provides ‘gaiadr3’ (for VizieR) and ‘dr3’ (for ESA's
Gaia database) shortcuts which you can give to ‘--dataset’ instead.
They will be internally mapped to the fully recognized name by Query.
In the list below that describes the available databases, the available
short names, that are recognized for each, are also listed.
*Not all datasets support TAP:* Large databases like VizieR have TAP
access for all their datasets. However, smaller databases have not
implemented TAP for all their tables. Therefore some datasets that are
searchable in their web interface may not be available for a TAP search.
To see the full list of TAP-ed datasets in a database, use the
‘--information’ (or ‘-i’) option with the dataset name like the command
below.
$ astquery astron -i
If your desired dataset is not in this list, but has web-access, contact
the database maintainers and ask them to add TAP access for it. After
they do it, you should see the name added to the output list of the
command above.
The list of databases recognized by Query (and their names in Query)
is described below. Since Query is a new member of the Gnuastro family
(first available in Gnuastro 0.14), this list will hopefully grow
significantly in the next releases. If you have any particular datasets
in mind, please let us know by sending an email to
‘bug-gnuastro@gnu.org’. If the dataset supports IVOA's TAP (Table
Access Protocol), it should be very easy to add.
‘astron’
The ASTRON Virtual Observatory service (<https://vo.astron.nl>) is
a database focused on radio astronomy data and images, primarily
those collected by ASTRON itself. A query to ‘astron’ is submitted
to ‘https://vo.astron.nl/__system__/tap/run/tap/sync’.
Here is the list of short names for dataset(s) in ASTRON's VO
service:
• ‘tgssadr --> tgssadr.main’
‘gaia’
The Gaia project (<https://www.cosmos.esa.int/web/gaia>) database
which is a large collection of star positions on the celestial
sphere, as well as peculiar velocities, parallaxes and magnitudes
in some bands among many others. Besides scientific studies (like
studying resolved stellar populations in the Galaxy and its halo),
Gaia is also invaluable for raw data calibrations, like astrometry.
A query to ‘gaia’ is submitted to
‘https://gea.esac.esa.int/tap-server/tap/sync’.
Here is the list of short names for popular datasets within Gaia:
• ‘dr3 --> gaiadr3.gaia_source’
• ‘edr3 --> gaiaedr3.gaia_source’
• ‘dr2 --> gaiadr2.gaia_source’
• ‘dr1 --> gaiadr1.gaia_source’
• ‘tycho2 --> public.tycho2’
• ‘hipparcos --> public.hipparcos’
‘ned’
The NASA/IPAC Extragalactic Database (NED,
<http://ned.ipac.caltech.edu>) is a fusion database, integrating
the information about extra-galactic sources from many large sky
surveys into a single catalog. It covers the full spectrum, from
Gamma rays to radio frequencies and is updated when new data
arrives. A TAP query to ‘ned’ is submitted to
‘https://ned.ipac.caltech.edu/tap/sync’.
• ‘objdir --> NEDTAP.objdir’: default TAP-based dataset in NED.
• ‘extinction’: A command-line interface to the NED Extinction
Calculator
(https://ned.ipac.caltech.edu/extinction_calculator). It only
takes a central coordinate and returns a VOTable of the
calculated extinction in many commonly used filters at that
point. As a result, options like ‘--width’ or ‘--radius’ are
not supported. However, Gnuastro does not yet support the
VOTable format. Therefore, if you specify an ‘--output’ file,
it should have an ‘.xml’ suffix and the downloaded file will
not be checked.
Until VOTable support is added to Gnuastro, you can use GREP,
AWK and SED to convert the VOTable data into a FITS table with
a command like below (assuming the queried VOTable is called
‘ned-extinction.xml’):
grep '^<TR><TD>' ned-extinction.xml \
| sed -e's|<TR><TD>||' \
-e's|</TD></TR>||' \
-e's|</TD><TD>|@|g' \
| awk 'BEGIN{FS="@"; \
print "# Column 1: FILTER [name,str15] Filter name"; \
print "# Column 2: CENTRAL [um,f32] Central Wavelength"; \
print "# Column 3: EXTINCTION [mag,f32] Galactic Ext."; \
print "# Column 4: ADS_REF [ref,str50] ADS reference"} \
{printf "%-15s %g %g %s\n", $1, $2, $3, $4}' \
| asttable -oned-extinction.fits
Once the table is in FITS, you can easily get the extinction
for a certain filter (for example, the ‘SDSS r’ filter) like
the command below:
asttable ned-extinction.fits --equal=FILTER,"SDSS r" \
-cEXTINCTION
‘vizier’
Vizier (<https://vizier.u-strasbg.fr>) is arguably the largest
catalog database in astronomy: containing more than 20500 catalogs
as of mid January 2021. Almost all published catalogs in major
projects, and even the tables in many papers are archived and
accessible here. For example, VizieR also has a full copy of the
Gaia database mentioned below, with some additional standardized
columns (like RA and Dec in J2000).
The current implementation of ‘--limitinfo’ only looks into the
description of the datasets, but since VizieR is so large, there is
still a lot of room for improvement. Until then, if ‘--limitinfo’
is not sufficient, you can use VizieR's own web-based search for
your desired dataset: <http://cdsarc.u-strasbg.fr/viz-bin/cat>
Because VizieR curates such a diverse set of data from tens of
thousands of projects and aims for interoperability between them,
the column names in VizieR may not be identical to the column names
in the surveys' own databases (Gaia in the example above). A query
to ‘vizier’ is submitted to
‘http://tapvizier.u-strasbg.fr/TAPVizieR/tap/sync’.
Here is the list of short names for popular datasets within VizieR
(sorted alphabetically by their short name). Please feel free to
suggest other major catalogs (covering a wide area or commonly used
in your field).. For details on each dataset with necessary
citations, and links to web pages, look into their details with
their ViziR names in <https://vizier.u-strasbg.fr/viz-bin/VizieR>.
• ‘2mass --> II/246/out’ (2MASS All-Sky Catalog)
• ‘akarifis --> II/298/fis’ (AKARI/FIS All-Sky Survey)
• ‘allwise --> II/328/allwise’ (AllWISE Data Release)
• ‘apass9 --> II/336/apass9’ (AAVSO Photometric All Sky Survey,
DR9)
• ‘catwise --> II/365/catwise’ (CatWISE 2020 catalog)
• ‘des1 --> II/357/des_dr1’ (Dark Energy Survey data release 1)
• ‘gaiadr3 --> I/355/gaiadr3’ (GAIA Data Release 3)
• ‘gaiaedr3 --> I/350/gaiadr3’ (GAIA Early Data Release 3)
• ‘gaiadr2 --> I/345/gaia2’ (GAIA Data Release 2)
• ‘galex5 --> II/312/ais’ (All-sky Survey of GALEX DR5)
• ‘nomad --> I/297/out’ (Naval Observatory Merged Astrometric
Dataset)
• ‘panstarrs1 --> II/349/ps1’ (Pan-STARRS Data Release 1).
• ‘ppmxl --> I/317/sample’ (Positions and proper motions on the
ICRS)
• ‘sdss12 --> V/147/sdss12’ (SDSS Photometric Catalogue, Release
12)
• ‘usnob1 --> I/284/out’ (Whole-Sky USNO-B1.0 Catalog)
• ‘ucac5 --> I/340/ucac5’ (5th U.S. Naval Obs. CCD Astrograph
Catalog)
• ‘unwise --> II/363/unwise’ (Band-merged unWISE Catalog)
• ‘wise --> II/311/wise’ (WISE All-Sky data Release)
File: gnuastro.info, Node: Invoking astquery, Prev: Available databases, Up: Query
5.4.2 Invoking Query
--------------------
Query provides a high-level interface to downloading subsets of data
from databases. The executable name is ‘astquery’ with the following
general template
$ astquery DATABASE-NAME [OPTION...] ...
One line examples:
## Information about all datasets in ESA's GAIA database:
$ astquery gaia --information
## Only show catalogs in VizieR that have 'MUSE' in their
## description. The '-i' is short for '--information'.
$ astquery vizier -i --limitinfo=MUSE
## List of columns in 'J/A+A/608/A2/udf10' (one of the above).
$ astquery vizier --dataset=J/A+A/608/A2/udf10 -i
## ID, RA and Dec of all Gaia sources within an image.
$ astquery gaia --dataset=dr3 --overlapwith=image.fits \
-csource_id,ra,dec
## RA, Dec and Spectroscopic redshifts of objects in SDSS DR12
## spectroscopic redshift that overlap with 'image.fits'.
$ astquery vizier --dataset=sdss12 --overlapwith=image.fits \
-cRA_ICRS,DE_ICRS,zsp --range=zsp,1e-10,inf
## All columns of all entries in the Gaia DR3 catalog (hosted at
## VizieR) within 1 arc-minute of the given coordinate.
$ astquery vizier --dataset=gaiadr3 --output=my-gaia.fits \
--center=113.8729761,31.9027152 --radius=1/60 \
## Similar to above, but only ID, RA and Dec columns for objects with
## magnitude range 10 to 15. In VizieR, this column is called 'Gmag'.
## Also, using sexagesimal coordinates instead of degrees for center.
$ astquery vizier --dataset=gaiadr3 --output=my-gaia.fits \
--center=07h35m29.51,31d54m9.77 --radius=1/60 \
--range=Gmag,10:15 -cDR3Name,RAJ2000,DEJ2000
Query takes a single argument which is the name of the database. For
the full list of available databases and accessing them, see *note
Available databases::. There are two methods to query the databases,
each is more fully discussed in its option's description below.
• *Low-level:* With ‘--query’ you can directly give a raw query
statement that is recognized by the database. This is very low
level and will require a good knowledge of the database's query
language, but of course, it is much more powerful. If this option
is given, the raw string is directly passed to the server and all
other constraints/options (for Query's high-level interface) are
ignored.
• *High-level:* With the high-level options (like ‘--column’,
‘--center’, ‘--radius’, ‘--range’ and other constraining options
below), the low-level query will be constructed automatically for
the particular database. This method is only limited to the
generic capabilities that Query provides for all servers. So
‘--query’ is more powerful, however, in this mode, you do not need
any knowledge of the database's query language. You can see the
internally generated query on the terminal (if ‘--quiet’ is not
used) or in the 0-th extension of the output (if it is a FITS
file). This full command contains the internally generated query.
The name of the downloaded output file can be set with ‘--output’.
The requested output format can have any of the *note Recognized table
formats:: (currently ‘.txt’ or ‘.fits’). Like all Gnuastro programs, if
the output is a FITS file, the zero-th/first HDU of the output will
contain all the command-line options given to Query as well as the full
command used to access the server. When ‘--output’ is not set, the
output name will be in the format of ‘NAME-STRING.fits’, where ‘NAME’ is
the name of the database and ‘STRING’ is a randomly selected 6-character
set of numbers and alphabetic characters. With this feature, a second
run of ‘astquery’ that is not called with ‘--output’ will not over-write
an already downloaded one. Generally, when calling Query more than
once, it is recommended to set an output name for each call based on
your project's context.
The outputs of Query will have a common output format, irrespective
of the used database. To achieve this, Query will ask the databases to
provide a FITS table output (for larger tables, FITS can consume much
less download volume). After downloading is complete, the raw
downloaded file will be read into memory once by Query, and written into
the file given to ‘--output’. The raw downloaded file will be deleted
by default, but can be preserved with the ‘--keeprawdownload’ option.
This strategy avoids unnecessary surprises depending on database. For
example, some databases can download a compressed FITS table, even
though we ask for FITS. But with the strategy above, the final output
will be an uncompressed FITS file. The metadata that is added by Query
(including the full download command) is also very useful for future
usage of the downloaded data. Unfortunately many databases do not write
the input queries into their generated tables.
‘--dry-run’
Only print the final download command to contact the server, do not
actually run it. This option is good when you want to check the
finally constructed query or download options given to the download
program. You may also want to use the constructed command as a
base to do further customizations on it and run it yourself.
‘-k’
‘--keeprawdownload’
Do not delete the raw downloaded file from the database. The name
of the raw download will have a ‘OUTPUT-raw-download.fits’ format.
Where ‘OUTPUT’ is either the base-name of the final output file
(without a suffix).
‘-i’
‘--information’
Print the information of all datasets (tables) within a database or
all columns within a database. When ‘--dataset’ is specified, the
latter mode (all column information) is downloaded and printed and
when it is not defined, all dataset information (within the
database) is printed.
Some databases (like VizieR) contain tens of thousands of datasets,
so you can limit the downloaded and printed information for
available databases with the ‘--limitinfo’ option (described
below). Dataset descriptions are often large and contain a lot of
text (unlike column descriptions). Therefore when printing the
information of all datasets within a database, the information
(e.g., database name) will be printed on separate lines before the
description. However, when printing column information, the output
has the same format as a similar option in Table (see *note
Invoking asttable::).
Important note to consider: the printed order of the datasets or
columns is just for displaying in the printed output. You cannot
ask for datasets or columns based on the printed order, you need to
use dataset or column names.
‘-L STR’
‘--limitinfo=STR’
Limit the information that is downloaded and displayed (with
‘--information’) to those that have the string given to this option
in their description. Note that _this is case-sensitive_. This
option is only relevant when ‘--information’ is also called.
Databases may have thousands (or tens of thousands) of datasets.
Therefore just the metadata (information) to show with
‘--information’ can be tens of megabytes (for example, the full
VizieR metadata file is about 23Mb as of January 2021). Once
downloaded, it can also be hard to parse manually. With
‘--limitinfo’, only the metadata of datasets that contain this
string _in their description_ will be downloaded and displayed,
greatly improving the speed of finding your desired dataset.
‘-Q "STR"’
‘--query="STR"’
Directly specify the query to be passed onto the database. The
queries will generally contain space and other meta-characters, so
we recommend placing the query within quotations.
‘-s STR’
‘--dataset=STR’
The dataset to query within the database (not compatible with
‘--query’). This option is mandatory when ‘--query’ or
‘--information’ are not provided. You can see the list of
available datasets within a database using ‘--information’
(possibly supplemented by ‘--limitinfo’). The output of
‘--information’ will contain the recognized name of the datasets
within that database. You can pass the recognized name directly to
this option. For more on finding and using your desired database,
see *note Available databases::.
‘-c STR’
‘--column=STR[,STR[,...]]’
The column name(s) to retrieve from the dataset in the given order
(not compatible with ‘--query’). If not given, all the dataset's
columns for the selected rows will be queried (which can be
large!). This option can take multiple values in one instance (for
example, ‘--column=ra,dec,mag’), or in multiple instances (for
example, ‘-cra -cdec -cmag’), or mixed (for example, ‘-cra,dec
-cmag’).
In case, you do not know the full list of the dataset's column
names a-priori, and you do not want to download all the columns
(which can greatly decrease your download speed), you can use the
‘--information’ option combined with the ‘--dataset’ option, see
*note Available databases::.
‘-H INT’
‘--head=INT’
Only ask for the first ‘INT’ rows of the finally selected columns,
not all the rows. This can be good when your search can result a
large dataset, but before downloading the full volume, you want to
see the top rows and get a feeling of what the whole dataset looks
like.
‘-v FITS’
‘--overlapwith=FITS’
File name of FITS file containing an image (in the HDU given by
‘--hdu’) to use for identifying the region to query in the give
database and dataset. Based on the image's WCS and pixel size, the
sky coverage of the image is estimated and values to the
‘--center’, ‘--width’ will be calculated internally. Hence this
option cannot be used with ‘--center’, ‘--width’ or ‘--radius’.
Also, since it internally generates the query, it cannot be used
with ‘--query’.
Note that if the image has WCS distortions and the reference point
for the WCS is not within the image, the WCS will not be
well-defined. Therefore the resulting catalog may not overlap, or
correspond to a larger/small area in the sky.
‘-C FLT,FLT’
‘--center=FLT,FLT’
The spatial center position (mostly RA and Dec) to use for the
automatically generated query (not compatible with ‘--query’). The
comma-separated values can either be in degrees (a single number),
or sexagesimal (‘_h_m_’ for RA, ‘_d_m_’ for Dec, or ‘_:_:_’ for
both).
The given values will be compared to two columns in the database to
find/return rows within a certain region around this center
position will be requested and downloaded. Pre-defined RA and Dec
column names are defined in Query for every database, however you
can use ‘--ccol’ to select other columns to use instead. The
region can either be a circle and the point (configured with
‘--radius’) or a box/rectangle around the point (configured with
‘--width’).
‘--ccol=STR,STR’
The name of the coordinate-columns in the dataset to compare with
the values given to ‘--center’. Query will use its internal
defaults for each dataset (for example, ‘RAJ2000’ and ‘DEJ2000’ for
VizieR data). But each dataset is treated separately and it is not
guaranteed that these columns exist in all datasets. Also, more
than one coordinate system/epoch may be present in a dataset and
you can use this option to construct your spatial constraint based
on the others coordinate systems/epochs.
‘-r FLT’
‘--radius=FLT’
The radius about the requested center to use for the automatically
generated query (not compatible with ‘--query’). The radius is in
units of degrees, but you can use simple division with this option
directly on the command-line. For example, if you want a radius of
20 arc-minutes or 20 arc-seconds, you can use ‘--radius=20/60’ or
‘--radius=20/3600’ respectively (which is much more human-friendly
than ‘0.3333’ or ‘0.005556’).
‘-w FLT[,FLT]’
‘--width=FLT[,FLT]’
The square (or rectangle) side length (width) about the requested
center to use for the automatically generated query (not compatible
with ‘--query’). If only one value is given to ‘--width’ the
region will be a square, but if two values are given, the widths of
the query box along each dimension will be different. The value(s)
is (are) in the same units as the coordinate column (see ‘--ccol’,
usually RA and Dec which are degrees). You can use simple division
for each value directly on the command-line if you want relatively
small (and more human-friendly) sizes. For example, if you want
your box to be 1 arc-minutes along the RA and 2 arc-minutes along
Dec, you can use ‘--width=1/60,2/60’.
‘-g STR,FLT,FLT’
‘--range=STR,FLT,FLT’
The column name and numerical range (inclusive) of acceptable
values in that column (not compatible with ‘--query’). This option
can be called multiple times for applying range limits on many
columns in one call (thus greatly reducing the download size). For
example, when used on the ESA gaia database, you can use
‘--range=phot_g_mean_mag,10:15’ to only get rows that have a value
between 10 and 15 (inclusive on both sides) in the
‘phot_g_mean_mag’ column.
If you want all rows larger, or smaller, than a certain number, you
can use ‘inf’, or ‘-inf’ as the first or second values
respectively. For example, if you want objects with SDSS
spectroscopic redshifts larger than 2 (from the VizieR ‘sdss12’
database), you can use ‘--range=zsp,2,inf’
If you want the interval to not be inclusive on both sides, you can
run ‘astquery’ once and get the command that it executes. Then you
can edit it to be non-inclusive on your desired side.
‘-b STR[,STR]’
‘--noblank=STR[,STR]’
Only ask for rows that do not have a blank value in the ‘STR’
column. This option can be called many times, and each call can
have multiple column names (separated by a comma or <,>). For
example, if you want the retrieved rows to not have a blank value
in columns ‘A’, ‘B’, ‘C’ and ‘D’, you can use ‘--noblank=A
-bB,C,D’.
‘--sort=STR[,STR]’
Ask for the server to sort the downloaded data based on the given
columns. For example, let's assume your desired catalog has column
‘Z’ for redshift and column ‘MAG_R’ for magnitude in the R band.
When you call ‘--sort=Z,MAG_R’, it will primarily sort the columns
based on the redshift, but if two objects have the same redshift,
they will be sorted by magnitude. You can add as many columns as
you like for higher-level sorting.
File: gnuastro.info, Node: Data manipulation, Next: Data analysis, Prev: Data containers, Up: Top
6 Data manipulation
*******************
Images are one of the major formats of data that is used in astronomy.
The functions in this chapter explain the GNU Astronomy Utilities which
are provided for their manipulation. For example, cropping out a part
of a larger image or convolving the image with a given kernel or
applying a transformation to it.
* Menu:
* Crop:: Crop region(s) from a dataset.
* Arithmetic:: Arithmetic on input data.
* Convolve:: Convolve an image with a kernel.
* Warp:: Warp/Transform an image to a different grid.
File: gnuastro.info, Node: Crop, Next: Arithmetic, Prev: Data manipulation, Up: Data manipulation
6.1 Crop
========
Astronomical images are often very large, filled with thousands of
galaxies. It often happens that you only want a section of the image,
or you have a catalog of sources and you want to visually analyze them
in small postage stamps. Crop is made to do all these things. When
more than one crop is required, Crop will divide the crops between
multiple threads to significantly reduce the run time.
Astronomical surveys are usually extremely large. So large in fact,
that the whole survey will not fit into a reasonably sized file.
Because of this, surveys usually cut the final image into separate tiles
and store each tile in a file. For example, the COSMOS survey's Hubble
space telescope, ACS F814W image consists of 81 separate FITS images,
with each one having a volume of 1.7 Gigabytes.
Even though the tile sizes are chosen to be large enough that too
many galaxies/targets do not fall on the edges of the tiles, inevitably
some do. So when you simply crop the image of such targets from one
tile, you will miss a large area of the surrounding sky (which is
essential in estimating the noise). Therefore in its WCS mode, Crop
will stitch parts of the tiles that are relevant for a target (with the
given width) from all the input images that cover that region into the
output. Of course, the tiles have to be present in the list of input
files.
Besides cropping postage stamps around certain coordinates, Crop can
also crop arbitrary polygons from an image (or a set of tiles by
stitching the relevant parts of different tiles within the polygon), see
‘--polygon’ in *note Invoking astcrop::. Alternatively, it can crop out
rectangular regions through the ‘--section’ option from one image, see
*note Crop section syntax::.
* Menu:
* Crop modes:: Basic modes to define crop region.
* Crop section syntax:: How to define a section to crop.
* Blank pixels:: Pixels with no value.
* Invoking astcrop:: Calling Crop on the command-line
File: gnuastro.info, Node: Crop modes, Next: Crop section syntax, Prev: Crop, Up: Crop
6.1.1 Crop modes
----------------
In order to be comprehensive, intuitive, and easy to use, there are two
ways to define the crop:
1. From its center and side length. For example, if you already know
the coordinates of an object and want to inspect it in an image or
to generate postage stamps of a catalog containing many such
coordinates.
2. The vertices of the crop region, this can be useful for larger
crops over many targets, for example, to crop out a uniformly deep,
or contiguous, region of a large survey.
Irrespective of how the crop region is defined, the coordinates to
define the crop can be in Image (pixel) or World Coordinate System (WCS)
standards. All coordinates are read as floating point numbers (not
integers, except for the ‘--section’ option, see below). By setting the
_mode_ in Crop, you define the standard that the given coordinates must
be interpreted. Here, the different ways to specify the crop region are
discussed within each standard. For the full list options, please see
*note Invoking astcrop::.
When the crop is defined by its center, the respective (integer)
central pixel position will be found internally according to the FITS
standard. To have this pixel positioned in the center of the cropped
region, the final cropped region will have an add number of pixels (even
if you give an even number to ‘--width’ in image mode).
Furthermore, when the crop is defined as by its center, Crop allows
you to only keep crops what do not have any blank pixels in the vicinity
of their center (your primary target). This can be very convenient when
your input catalog/coordinates originated from another survey/filter
which is not fully covered by your input image, to learn more about this
feature, please see the description of the ‘--checkcenter’ option in
*note Invoking astcrop::.
Image coordinates
In image mode (‘--mode=img’), Crop interprets the pixel coordinates
and widths in units of the input data-elements (for example, pixels
in an image, not world coordinates). In image mode, only one image
may be input. The output crop(s) can be defined in multiple ways
as listed below.
Center of multiple crops (in a catalog)
The center of (possibly multiple) crops are read from a text
file. In this mode, the columns identified with the
‘--coordcol’ option are interpreted as the center of a crop
with a width of ‘--width’ pixels along each dimension. The
columns can contain any floating point value. The value to
‘--output’ option is seen as a directory which will host (the
possibly multiple) separate crop files, see *note Crop
output:: for more. For a tutorial using this feature, please
see *note Reddest clumps cutouts and parallelization::.
Center of a single crop (on the command-line)
The center of the crop is given on the command-line with the
‘--center’ option. The crop width is specified by the
‘--width’ option along each dimension. The given coordinates
and width can be any floating point number.
Vertices of a single crop
In Image mode there are two options to define the vertices of
a region to crop: ‘--section’ and ‘--polygon’. The former is
lower-level (does not accept floating point vertices, and only
a rectangular region can be defined), it is also only
available in Image mode. Please see *note Crop section
syntax:: for a full description of this method.
The latter option (‘--polygon’) is a higher-level method to
define any polygon (with any number of vertices) with floating
point values. Please see the description of this option in
*note Invoking astcrop:: for its syntax.
WCS coordinates
In WCS mode (‘--mode=wcs’), the coordinates and width are
interpreted using the World Coordinate System (WCS, that must
accompany the dataset), not pixel coordinates. You can optionally
use ‘--widthinpix’ for the width to be interpreted in pixels (even
though the coordinates are in WCS). In WCS mode, Crop accepts
multiple datasets as input. When the cropped region (defined by
its center or vertices) overlaps with multiple of the input
images/tiles, the overlapping regions will be taken from the
respective input (they will be stitched when necessary for each
output crop).
In this mode, the input images do not necessarily have to be the
same size, they just need to have the same orientation and pixel
resolution. Currently only orientation along the celestial
coordinates is accepted, if your input has a different orientation
or resolution you can use Warp's ‘--gridfile’ option to align the
image before cropping it (see *note Warp::).
Each individual input image/tile can even be smaller than the final
crop. In any case, any part of any of the input images which
overlaps with the desired region will be used in the crop. Note
that if there is an overlap in the input images/tiles, the pixels
from the last input image read are going to be used for the
overlap. Crop will not change pixel values, so it assumes your
overlapping tiles were cutout from the same original image. There
are multiple ways to define your cropped region as listed below.
Center of multiple crops (in a catalog)
Similar to catalog inputs in Image mode (above), except that
the values along each dimension are assumed to have the same
units as the dataset's WCS information. For example, the
central RA and Dec value for each crop will be read from the
first and second calls to the ‘--coordcol’ option. The width
of the cropped box (in units of the WCS, or degrees in RA and
Dec mode) must be specified with the ‘--width’ option. You
can optionally use ‘--widthinpix’ for the value of ‘--width’
to be interpreted in pixels.
Center of a single crop (on the command-line)
You can specify the center of only one crop box with the
‘--center’ option. If it exists in the input images, it will
be cropped similar to the catalog mode, see above also for
‘--width’.
Vertices of a single crop
The ‘--polygon’ option is a high-level method to define any
convex polygon (with any number of vertices). Please see the
description of this option in *note Invoking astcrop:: for its
syntax.
*CAUTION:* In WCS mode, the image has to be aligned with the
celestial coordinates, such that the first FITS axis is parallel
(opposite direction) to the Right Ascension (RA) and the second
FITS axis is parallel to the declination. If these conditions are
not met for an image, Crop will warn you and abort. You can use
Warp to align the input image to standard celestial coordinates,
see *note Warp::.
As a summary, if you do not specify a catalog, you have to define the
cropped region manually on the command-line. In any case the mode is
mandatory for Crop to be able to interpret the values given as
coordinates or widths.
File: gnuastro.info, Node: Crop section syntax, Next: Blank pixels, Prev: Crop modes, Up: Crop
6.1.2 Crop section syntax
-------------------------
When in image mode, one of the methods to crop only one rectangular
section from the input image is to use the ‘--section’ option. Crop has
a powerful syntax to read the box parameters from a string of
characters. If you leave certain parts of the string to be empty, Crop
can fill them for you based on the input image sizes.
To define a box, you need the coordinates of two points: the first
(‘X1’, ‘Y1’) and the last pixel (‘X2’, ‘Y2’) pixel positions in the
image, or four integer numbers in total. The four coordinates can be
specified with one string in this format: '‘X1:X2,Y1:Y2’'. This string
is given to the ‘--section’ option. Therefore, the pixels along the
first axis that are $\geq$‘X1’ and $\leq$‘X2’ will be included in the
cropped image. The same goes for the second axis. Note that each
different term will be read as an integer, not a float.
The reason it only accepts integers is that ‘--section’ is a
low-level option (which is also very fast!). For a higher-level way to
specify region (any polygon, not just a box), please see the ‘--polygon’
option in *note Crop options::. Also note that in the FITS standard,
pixel indexes along each axis start from unity(1) not zero(0).
You can omit any of the values and they will be filled automatically.
The left hand side of the colon (‘:’) will be filled with ‘1’, and the
right side with the image size. So, ‘2:,:’ will include the full range
of pixels along the second axis and only those with a first axis index
larger than ‘2’ in the first axis. If the colon is omitted for a
dimension, then the full range is automatically used. So the same
string is also equal to ‘2:,’ or ‘2:’ or even ‘2’. If you want such a
case for the second axis, you should set it to: ‘,2’.
If you specify a negative value, it will be seen as before the
indexes of the image which are outside the image along the bottom or
left sides when viewed in SAO DS9. In case you want to count from the
top or right sides of the image, you can use an asterisk (‘*’). When
confronted with a ‘*’, Crop will replace it with the maximum length of
the image in that dimension. So ‘*-10:*+10,*-20:*+20’ will mean that
the crop box will be 20\times40 pixels in size and only include the top
corner of the input image with 3/4 of the image being covered by blank
pixels, see *note Blank pixels::.
If you feel more comfortable with space characters between the
values, you can use as many space characters as you wish, just be
careful to put your value in double quotes, for example,
‘--section="5:200, 123:854"’. If you forget the quotes, anything after
the first space will not be seen by ‘--section’ and you will most
probably get an error because the rest of your string will be read as a
filename (which most probably does not exist). See *note Command-line::
for a description of how the command-line works.
File: gnuastro.info, Node: Blank pixels, Next: Invoking astcrop, Prev: Crop section syntax, Up: Crop
6.1.3 Blank pixels
------------------
The cropped box can potentially include pixels that are beyond the image
range. For example, when a target in the input catalog was very near
the edge of the input image. The parts of the cropped image that were
not in the input image will be filled with the following two values
depending on the data type of the image. In both cases, SAO DS9 will
not color code those pixels.
• If the data type of the image is a floating point type (float or
double), IEEE NaN (Not a number) will be used.
• For integer types, pixels out of the image will be filled with the
value of the ‘BLANK’ keyword in the cropped image header. The
value assigned to it is the lowest value possible for that type, so
you will probably never need it any way. Only for the unsigned
character type (‘BITPIX=8’ in the FITS header), the maximum value
is used because it is unsigned, the smallest value is zero which is
often meaningful.
You can ask for such blank regions to not be included in the output
crop image using the ‘--noblank’ option. In such cases, there is no
guarantee that the image size of your outputs are what you asked for.
In some survey images, unfortunately they do not use the ‘BLANK’ FITS
keyword. Instead they just give all pixels outside of the survey area a
value of zero. So by default, when dealing with float or double image
types, any values that are 0.0 are also regarded as blank regions. This
can be turned off with the ‘--zeroisnotblank’ option.
File: gnuastro.info, Node: Invoking astcrop, Prev: Blank pixels, Up: Crop
6.1.4 Invoking Crop
-------------------
Crop will crop a region from an image. If in WCS mode, it will also
stitch parts from separate images in the input files. The executable
name is ‘astcrop’ with the following general template
$ astcrop [OPTION...] [ASCIIcatalog] ASTRdata ...
One line examples:
## Crop all objects in cat.txt from image.fits:
$ astcrop --catalog=cat.txt image.fits
## Crop all options in catalog (with RA,DEC) from all the files
## ending in `_drz.fits' in `/mnt/data/COSMOS/':
$ astcrop --mode=wcs --catalog=cat.txt /mnt/data/COSMOS/*_drz.fits
## Crop the outer 10 border pixels of the input image and give
## the output HDU a name ('EXTNAME' keyword in FITS) of 'mysection'.
$ astcrop --section=10:*-10,10:*-10 --hdu=2 image.fits \
--metaname=mysection
## Crop region around RA and Dec of (189.16704, 62.218203):
$ astcrop --mode=wcs --center=189.16704,62.218203 goodsnorth.fits
## Same crop above, but coordinates given in sexagesimal (you can
## also use ':' between the sexagesimal components).
$ astcrop --mode=wcs --center=12h36m40.08,62d13m5.53 goodsnorth.fits
## Crop region around pixel coordinate (568.342, 2091.719):
$ astcrop --mode=img --center=568.342,2091.719 --width=201 image.fits
## Crop all HDUs within a FITS file at a certain coordinate, while
## preserving the names of the HDUs in the output.
$ for hdu in $(astfits input.fits --listimagehdus); do \
astcrop input.fits --hdu=$hdu --append --output=crop.fits \
--metaname=$hdu --mode=wcs --center=189.16704,62.218203 \
--width=10/3600
done
Crop has one mandatory argument which is the input image name(s), shown
above with ‘ASTRdata ...’. You can use shell expansions, for example,
‘*’ for this if you have lots of images in WCS mode. If the crop box
centers are in a catalog, you can use the ‘--catalog’ option. In other
cases, you have to provide the single cropped output parameters must be
given with command-line options. See *note Crop output:: for how the
output file name(s) can be specified. For the full list of general
options to all Gnuastro programs (including Crop), please see *note
Common options::.
Floating point numbers can be used to specify the crop region (except
the ‘--section’ option, see *note Crop section syntax::). In such
cases, the floating point values will be used to find the desired
integer pixel indices based on the FITS standard. Hence, Crop
ultimately does not do any sub-pixel cropping (in other words, it does
not change pixel values). If you need such crops, you can use *note
Warp:: to first warp the image to the a new pixel grid, then crop from
that. For example, let's assume you want a crop from pixels 12.982 to
80.982 along the first dimension. You should first translate the image
by $-0.482$ (note that the edge of a pixel is at integer multiples of
$0.5$). So you should run Warp with ‘--translate=-0.482,0’ and then
crop the warped image with ‘--section=13:81’.
There are two ways to define the cropped region: with its center or
its vertices. See *note Crop modes:: for a full description. In the
former case, Crop can check if the central region of the cropped image
is indeed filled with data or is blank (see *note Blank pixels::), and
not produce any output when the center is blank, see the description
under ‘--checkcenter’ for more.
When in catalog mode, Crop will run in parallel unless you set
‘--numthreads=1’, see *note Multi-threaded operations::. Note that when
multiple outputs are created with threads, the outputs will not be
created in the same order. This is because the threads are asynchronous
and thus not started in order. This has no effect on each output, see
*note Reddest clumps cutouts and parallelization:: for a tutorial on
effectively using this feature.
* Menu:
* Crop options:: A list of all the options with explanation.
* Crop output:: The outputs of Crop.
* Crop known issues:: Known issues in running Crop.
File: gnuastro.info, Node: Crop options, Next: Crop output, Prev: Invoking astcrop, Up: Invoking astcrop
6.1.4.1 Crop options
....................
The options can be classified into the following contexts: Input, Output
and operating mode options. Options that are common to all Gnuastro
program are listed in *note Common options:: and will not be repeated
here.
When you are specifying the crop vertices yourself (through
‘--section’, or ‘--polygon’) on relatively small regions (depending on
the resolution of your images) the outputs from image and WCS mode can
be approximately equivalent. However, as the crop sizes get large, the
curved nature of the WCS coordinates have to be considered. For
example, when using ‘--section’, the right ascension of the bottom left
and top left corners will not be equal. If you only want regions within
a given right ascension, use ‘--polygon’ in WCS mode.
Input image parameters:
‘--hstartwcs=INT’
Specify the first keyword card (line number) to start finding the
input image world coordinate system information. This is useful
when certain header keywords of the input may cause bad conflicts
with your crop (see an example described below). To get line
numbers of the header keywords, you can pipe the fully printed
header into ‘cat -n’ like below:
$ astfits image.fits -h1 | cat -n
For example, distortions have only been present in WCSLIB from
version 5.15 (released in mid 2016). Therefore some pipelines
still apply their own specific set of WCS keywords for distortions
and put them into the image header along with those that WCSLIB
does recognize. So now that WCSLIB recognizes most of the standard
distortion parameters, they will get confused with the old ones and
give wrong results. For example, in the CANDELS-GOODS South images
that were created before WCSLIB 5.15(1).
The two ‘--hstartwcs’ and ‘--hendwcs’ are thus provided so when
using older datasets, you can specify what region in the FITS
headers you want to use to read the WCS keywords. Note that this
is only relevant for reading the WCS information, basic data
information like the image size are read separately. These two
options will only be considered when the value to ‘--hendwcs’ is
larger than that of ‘--hstartwcs’. So if they are equal or
‘--hstartwcs’ is larger than ‘--hendwcs’, then all the input
keywords will be parsed to get the WCS information of the image.
‘--hendwcs=INT’
Specify the last keyword card to read for specifying the image
world coordinate system on the input images. See ‘--hstartwcs’
Crop box parameters:
‘-c FLT[,FLT[,...]]’
‘--center=FLT[,FLT[,...]]’
The central position of the crop in the input image. The positions
along each dimension must be separated by a comma (<,>) and
fractions are also acceptable. The comma-separated values can
either be in degrees (a single number), or sexagesimal (‘_h_m_’ for
RA, ‘_d_m_’ for Dec, or ‘_:_:_’ for both).
The number of values given to this option must be the same as the
dimensions of the input dataset. The width of the crop should be
set with ‘--width’. The units of the coordinates are read based on
the value to the ‘--mode’ option, see below.
‘-O STR’
‘--mode=STR’
Mode to interpret the crop's coordinates (for example with
‘--center’, ‘--catalog’ or ‘--polygon’). The value must either be
‘img’ (to assume image/pixel coordinates) or ‘wcs’ (to assume WCS,
usually RA/Dec, coordinates), see *note Crop modes:: for a full
description.
‘-w FLT[,FLT[,...]]’
‘--width=FLT[,FLT[,...]]’
Width of the cropped region about coordinate given to ‘--center’.
If in WCS mode, value(s) given to this option will be read in the
same units as the dataset's WCS information along this dimension
(unless ‘--widthinpix’ is given). This option may take either a
single value (to be used for all dimensions: ‘--width=10’ in
image-mode will crop a $10\times10$ pixel image) or multiple values
(a specific value for each dimension: ‘--width=10,20’ in image-mode
will crop a $10\times20$ pixel image).
The ‘--width’ option also accepts fractions. For example, if you
want the width of your crop to be 3 by 5 arcseconds along RA and
Dec respectively and you are in wcs-mode, you can use:
‘--width=3/3600,5/3600’.
The final output will have an odd number of pixels to allow easy
identification of the pixel which keeps your requested coordinate
(from ‘--center’ or ‘--catalog’). If you want an even sided crop,
you can run Crop afterwards with ‘--section=":*-1,:*-1"’ or
‘--section=2:,2:’ (depending on which side you do not need), see
*note Crop section syntax::.
The basic reason for making an odd-sided crop is that your given
central coordinate will ultimately fall within a discrete pixel in
the image (defined by the FITS standard). When the crop has an odd
number of pixels in each dimension, that pixel can be very well
defined as the "central" pixel of the crop, making it unambiguously
easy to identify. However, for an even-sided crop, it will be very
hard to identify the central pixel (it can be on any of the four
pixels adjacent to the central point of the image!).
‘-X’
‘--widthinpix’
In WCS mode, interpret the value to ‘--width’ as number of pixels,
not the WCS units like degrees. This is useful when you want a
fixed crop size in pixels, even though your center coordinates are
in WCS (for example, RA and Dec).
‘-l STR’
‘-l FLT:FLT,...’
‘--polygon=STR’
‘--polygon=FLT,FLT:FLT,FLT:...’
Polygon vertice coordinates (when value is in ‘FLT,FLT:FLT,FLT:...’
format) or the filename of a SAO DS9 region file (when the value
has no ‘,’ or ‘:’ characters). Each vertice can either be in
degrees (a single floating point number) or sexagesimal (in formats
of '‘_h_m_’' for RA and '‘_d_m_’' for Dec, or simply '‘_:_:_’' for
either of them).
The vertices are used to define the polygon: in the same order
given to this option. When the vertices are not necessarily
ordered in the proper order (for example, one vertice in a square
comes after its diagonal opposite), you can add the ‘--polygonsort’
option which will attempt to sort the vertices before cropping.
Note that for concave polygons, sorting is not recommended because
there is no unique solution, for more, see the description under
‘--polygonsort’.
This option can be used both in the image and WCS modes, see *note
Crop modes::. If a SAO DS9 region file is used, the coordinate
mode of Crop will be determined by the contents of the file and any
value given to ‘--mode’ is ignored. The cropped image will be the
size of the rectangular region that completely encompasses the
polygon. By default all the pixels that are outside of the polygon
will be set as blank values (see *note Blank pixels::). However,
if ‘--polygonout’ is called all pixels internal to the vertices
will be set to blank. In WCS-mode, you may provide many FITS
images/tiles: Crop will stitch them to produce this cropped region,
then apply the polygon.
The syntax for the polygon vertices is similar to, and simpler
than, that for ‘--section’. In short, the dimensions of each
coordinate are separated by a comma (<,>) and each vertex is
separated by a colon (<:>). You can define as many vertices as you
like. If you would like to use space characters between the
dimensions and vertices to make them more human-readable, then you
have to put the value to this option in double quotation marks.
For example, let's assume you want to work on the deepest part of
the WFC3/IR images of Hubble Space Telescope eXtreme Deep Field
(HST-XDF). According to the web page
(https://archive.stsci.edu/prepds/xdf/)(2) the deepest part is
contained within the coordinates:
[ (53.187414,-27.779152), (53.159507,-27.759633),
(53.134517,-27.787144), (53.161906,-27.807208) ]
They have provided mask images with only these pixels in the
WFC3/IR images, but what if you also need to work on the same
region in the full resolution ACS images? Also what if you want to
use the CANDELS data for the shallow region? Running Crop with
‘--polygon’ will easily pull out this region of the image for you,
irrespective of the resolution. If you have set the operating mode
to WCS mode in your nearest configuration file (see *note
Configuration files::), there is no need to call ‘--mode=wcs’ on
the command-line.
$ astcrop --mode=wcs desired-filter-image(s).fits \
--polygon="53.187414,-27.779152 : 53.159507,-27.759633 : \
53.134517,-27.787144 : 53.161906,-27.807208"
More generally, you have an image and want to define the polygon
yourself (it is not already published like the example above). As
the number of vertices increases, checking the vertex coordinates
on a FITS viewer (for example, SAO DS9) and typing them in, one by
one, can be very tedious and prone to typo errors. In such cases,
you can make a polygon "region" in DS9 and using your mouse, easily
define (and visually see) it. Given that SAO DS9 has a graphic
user interface (GUI), if you do not have the polygon vertices
before-hand, it is much more easier build your polygon there and
pass it onto Crop through the region file.
You can take the following steps to make an SAO DS9 region file
containing your polygon. Open your desired FITS image with SAO DS9
and activate its "region" mode with Edit→Region. Then define the
region as a polygon with Region→Shape→Polygon. Click on the
approximate center of the region you want and a small square will
appear. By clicking on the vertices of the square you can shrink
or expand it, clicking and dragging anywhere on the edges will
enable you to define a new vertex. After the region has been
nicely defined, save it as a file with Region→"Save Regions". You
can then select the name and address of the output file, keep the
format as ‘REG (*.reg)’ and press the "OK" button. In the next
window, keep format as "ds9" and "Coordinate System" as "fk5" for
RA and Dec (or "Image" for pixel coordinates). A plain text file
is now created (let's call it ‘ds9.reg’) which you can pass onto
Crop with ‘--polygon=ds9.reg’.
For the expected format of the region file, see the description of
‘gal_ds9_reg_read_polygon’ in *note SAO DS9 library::. However,
since SAO DS9 makes this file for you, you do not usually need to
worry about its internal format unless something un-expected
happens and you find a bug.
‘--polygonout’
Keep all the regions outside the polygon and mask the inner ones
with blank pixels (see *note Blank pixels::). This is practically
the inverse of the default mode of treating polygons. Note that
this option only works when you have only provided one input image.
If multiple images are given (in WCS mode), then the full area
covered by all the images has to be shown and the polygon excluded.
This can lead to a very large area if large surveys like COSMOS are
used. So Crop will abort and notify you. In such cases, it is
best to crop out the larger region you want, then mask the smaller
region with this option.
‘--polygonsort’
Sort the given set of vertices to the ‘--polygon’ option. For a
concave polygon it will sort the vertices correctly, however for a
convex polygon it there is no unique sorting, so be careful because
the crop may not be what you expected.
Polygons come in two classes: convex and concave (or generally,
non-convex!), see below for a demonstration. Convex polygons are
those where all inner angles are less than 180 degrees. By
contrast, a concave polygon is one where an inner angle may be more
than 180 degrees.
Concave Polygon Convex Polygon
D --------C D------------- C
\ | E / |
\E | \ |
/ | \ |
A--------B A ----------B
‘-s STR’
‘--section=STR’
Section of the input image which you want to be cropped. See *note
Crop section syntax:: for a complete explanation on the syntax
required for this input.
‘-C FITS/TXT’
‘--catalog=FITS/TXT’
File name of catalog for making multiple crops from the input
images/cubes. The catalog can be in any of Gnuastro's recognized
*note Recognized table formats::. The columns containing the
coordinates for the crop centers can be specified with the
‘--coordcol’ option (using column names or numbers, see *note
Selecting table columns::). The catalog can also contain the name
of each crop, you can specify the column containing the name with
the ‘--namecol’.
‘--cathdu=STR/INT’
The HDU (extension) containing the catalog (if the file given to
‘--catalog’ is a FITS file). This can either be the HDU name (if
it has one) or number (counting from 0). By default (if this
option is not given), the second HDU will be used (equivalent to
‘--cathdu=1’. For more on how to specify the HDU, see the
explanation of the ‘--hdu’ option in *note Input output options::.
‘-x STR/INT’
‘--coordcol=STR/INT’
The column in a catalog to read as a coordinate. The value can be
either the column number (starting from 1), or a match/search in
the table meta-data, see *note Selecting table columns::. This
option must be called multiple times, depending on the number of
dimensions in the input dataset. If it is called more than
necessary, the extra columns (later calls to this option on the
command-line or configuration files) will be ignored, see *note
Configuration file precedence::.
‘-n STR/INT’
‘--namecol=STR/INT’
Column selection of crop file name. The value can be either the
column number (starting from 1), or a match/search in the table
meta-data, see *note Selecting table columns::. This option can be
used both in Image and WCS modes, and not a mandatory. When a
column is given to this option, the final crop base file name will
be taken from the contents of this column. The directory will be
determined by the ‘--output’ option (current directory if not
given) and the value to ‘--suffix’ will be appended. When this
column is not given, the row number will be used instead.
---------- Footnotes ----------
(1) <https://archive.stsci.edu/pub/hlsp/candels/goods-s/gs-tot/v1.0/>
(2) <https://archive.stsci.edu/prepds/xdf/>
File: gnuastro.info, Node: Crop output, Next: Crop known issues, Prev: Crop options, Up: Invoking astcrop
6.1.4.2 Crop output
...................
The string given to ‘--output’ option will be interpreted depending on
how many crops were requested, see *note Crop modes:::
• When a catalog is given, the value of the ‘--output’ (see *note
Common options::) will be read as the directory to store the output
cropped images. Hence if it does not already exist, Crop will
abort with an "No such file or directory" error.
The crop file names will consist of two parts: a variable part (the
row number of each target starting from 1) along with a fixed
string which you can set with the ‘--suffix’ option. Optionally,
you may also use the ‘--namecol’ option to define a column in the
input catalog to use as the file name instead of numbers.
• When only one crop is desired, the value to ‘--output’ will be read
as a file name. If no output is specified or if it is a directory,
the output file name will follow the automatic output names of
Gnuastro, see *note Automatic output::: The string given to
‘--suffix’ will be replaced with the ‘.fits’ suffix of the input.
By default, as suggested by the FITS standard and implemented in all
Gnuastro programs, the first/primary extension of the output files will
only contain metadata. The cropped images/cubes will be written into
the 2nd HDU of their respective FITS file (which is actually counted as
‘1’ because HDU counting starts from ‘0’). However, if you want the
cropped data to be written into the primary (0-th) HDU, run Crop with
the ‘--primaryimghdu’ option.
If the output file already exists by default Crop will re-write it
(so that all existing HDUs in it will be deleted). If you want the
cropped HDU to be appended to existing HDUs, use ‘--append’ described
below.
The 0-th HDU of each output cropped image will contain the names of
the input image(s) it was cut from. If a name is longer than the 70
character space that the FITS standard allows for header keyword values,
the name will be cut into several keywords from the nearest slash (</>).
The keywords have the following format: ‘ICFn_m’ (for Crop File). Where
‘n’ is the number of the image used in this crop and ‘m’ is the part of
the name (it can be broken into multiple keywords). Following the name
is another keyword named ‘ICFnPIX’ which shows the pixel range from that
input image in the same syntax as *note Crop section syntax::. So this
string can be directly given to the ‘--section’ option later.
Once done, a log file can be created in the current directory with
the ‘--log’ option. This file will have three columns and the same
number of rows as the number of cropped images. There are also comments
on the top of the log file explaining basic information about the run
and descriptions for the columns. A short description of the columns is
also given below:
1. The cropped image file name for that row.
2. The number of input images that were used to create that image.
3. A ‘0’ if the central few pixels (value to the ‘--checkcenter’
option) are blank and ‘1’ if they are not. When the crop was not
defined by its center (see *note Crop modes::), or ‘--checkcenter’
was given a value of 0 (see *note Invoking astcrop::), the center
will not be checked and this column will be given a value of ‘-1’.
If the output crop(s) have a single element (pixel in an image) and
‘--oneelemstdout’ has been called, no output file will be produced!
Instead, the single element's value is printed on the standard output.
See the description of ‘--oneelemstdout’ below for more:
‘-p STR’
‘--suffix=STR’
The suffix (or post-fix) of the output files for when you want all
the cropped images to have a special ending. One case where this
might be helpful is when besides the science images, you want the
weight images (or exposure maps, which are also distributed with
survey images) of the cropped regions too. So in one run, you can
set the input images to the science images and ‘--suffix=_s.fits’.
In the next run you can set the weight images as input and
‘--suffix=_w.fits’.
‘-a STR’
‘--metaname=STR’
Name of cropped HDU (value to the ‘EXTNAME’ keyword of FITS). If
not given, a default ‘CROP’ will be placed there (so the ‘EXTNAME’
keyword will always be present in the output). If crop produces
many outputs from a catalog, they will be given the same string as
‘EXTNAME’ (the file names containing the cropped HDU will be
different).
‘-A’
‘--append’
If the output file already exists, append the cropped image HDU to
the end of any existing HDUs. By default (when this option isn't
given), if an output file already exists, any existing HDU in it
will be deleted. If the output file doesn't exist, this option is
redundant.
‘--primaryimghdu’
Write the output into the primary (0-th) HDU/extension of the
output. By default, like all Gnuastro's default outputs, no data
is written in the primary extension because the FITS standard
suggests keeping that extension free of data and only for metadata.
‘-t’
‘--oneelemstdout’
When a crop only has a single element (a single pixel), print it to
the standard output instead of making a file. By default (without
this option), a single-pixel crop will be saved to a file, just
like a crop of any other size.
When a single crop is requested (either through ‘--center’, or a
catalog of one row is given), the single value alone is printed
with nothing else. This makes it easy to immediately write the
value into a shell variable for example:
value=$(astcrop img.fits --mode=wcs --center=1.234,5.678 \
--width=1 --widthinpix --oneelemstdout \
--quiet)
If a catalog of coordinates is given (that would produce multiple
crops; or multiple values in this scenario), the solution for a
single value will not work! Recall that Crop will do the crops in
parallel, therefore each time you run it, the order of the rows
will be different and not correspond to the order of the inputs.
To allow identification of each value (which row of the input
catalog it corresponds to), Crop will first print the name of the
would-be created file name, and print the value after it (separated
by an empty SPACE character). In other words, the file in the
first column will not actually be created, but the value of the
pixel it would have contained (if this option was not called) is
printed after it.
‘-c FLT/INT’
‘--checkcenter=FLT/INT’
Square box width of region in the center of the image to check for
blank values. If any of the pixels in this central region of a
crop (defined by its center) are blank, then it will not be stored
in an output file. If the value to this option is zero, no
checking is done. This check is only applied when the cropped
region(s) are defined by their center (not by the vertices, see
*note Crop modes::).
The units of the value are interpreted based on the ‘--mode’ value
(in WCS or pixel units). The ultimate checked region size (in
pixels) will be an odd integer around the center (converted from
WCS, or when an even number of pixels are given to this option).
In WCS mode, the value can be given as fractions, for example, if
the WCS units are in degrees, ‘0.1/3600’ will correspond to a check
size of 0.1 arcseconds.
Because survey regions do not often have a clean square or
rectangle shape, some of the pixels on the sides of the survey FITS
image do not commonly have any data and are blank (see *note Blank
pixels::). So when the catalog was not generated from the input
image, it often happens that the image does not have data over some
of the points.
When the given center of a crop falls in such regions or outside
the dataset, and this option has a non-zero value, no crop will be
created. Therefore with this option, you can specify a width of a
small box (3 pixels is often good enough) around the central pixel
of the cropped image. You can check which crops were created and
which were not from the command-line (if ‘--quiet’ was not called,
see *note Operating mode options::), or in Crop's log file (see
*note Crop output::).
‘-b’
‘--noblank’
Pixels outside of the input image that are in the crop box will not
be used. By default they are filled with blank values (depending
on type), see *note Blank pixels::. This option only applies only
in Image mode, see *note Crop modes::.
‘-z’
‘--zeroisnotblank’
In float or double images, it is common to give the value of zero
to blank pixels. If the input image type is one of these two
types, such pixels will also be considered as blank. You can
disable this behavior with this option, see *note Blank pixels::.
File: gnuastro.info, Node: Crop known issues, Prev: Crop output, Up: Invoking astcrop
6.1.4.3 Crop known issues
.........................
When running Crop, you may encounter strange errors and bugs. In these
cases, please report a bug and we will try to fix it as soon as
possible, see *note Report a bug::. However, some things are beyond our
control, or may take too long to fix directly. In this section we list
such known issues that may occur in known cases and suggest the hack (or
work-around) to fix the problem:
Crash with ‘Killed’ when cropping catalog from ‘.fits.gz’
This happens because CFISTIO (that reads and writes FITS files)
will internally decompress the file in a temporary place (possibly
in the RAM), then start reading from it. On the other hand, by
default when given a catalog (with many crops) and not specifying
‘--numthreads’, Crop will use the maximum number of threads
available on your system to do each crop faster. On an normal (not
compressed) file, parallel access will not cause a problem,
however, when attempting parallel access with the maximum number of
threads on a compressed file, CFITSIO crashes with ‘Killed’.
Therefore the following solutions can be used to fix this crash:
• Decrease the number of threads (at the minimum, set
‘--numthreads=1’). Since this solution does not attempt to
change any of your previous Crop command components or does
not change your local file structure, it is the preferred way.
• Decompress the file (with the command below) and feed the
‘.fits’ file into Crop without changing the number of threads.
$ gunzip -k image.fits.gz
File: gnuastro.info, Node: Arithmetic, Next: Convolve, Prev: Crop, Up: Data manipulation
6.2 Arithmetic
==============
It is commonly necessary to do operations on some or all of the elements
of a dataset independently (pixels in an image). For example, in the
reduction of raw data it is necessary to subtract the Sky value (*note
Sky value::) from each image image. Later (once the images as warped
into a single grid using Warp for example, see *note Warp::), the images
are co-added (the output pixel grid is the average of the pixels of the
individual input images). Arithmetic is Gnuastro's program for such
operations on your datasets directly from the command-line. It
currently uses the reverse polish or post-fix notation, see *note
Reverse polish notation:: and will work on the native data types of the
input images/data to reduce CPU and RAM resources, see *note Numeric
data types::. For more information on how to run Arithmetic, please see
*note Invoking astarithmetic::.
* Menu:
* Reverse polish notation:: The current notation style for Arithmetic.
* Integer benefits and pitfalls:: Integers have benefits, but require care.
* Noise basics:: Introduction various noise models.
* Arithmetic operators:: List of operators known to Arithmetic.
* Invoking astarithmetic:: How to run Arithmetic: options and output.
File: gnuastro.info, Node: Reverse polish notation, Next: Integer benefits and pitfalls, Prev: Arithmetic, Up: Arithmetic
6.2.1 Reverse polish notation
-----------------------------
The most common notation for arithmetic operations is the infix notation
(https://en.wikipedia.org/wiki/Infix_notation) where the operator goes
between the two operands, for example, $4+5$. The infix notation is the
preferred way in most programming languages which come with scripting
features for large programs. This is because the infix notation
requires a way to define precedence when more than one operator is
involved.
For example, consider the statement ‘5 + 6 / 2’. Should 6 first be
divided by 2, then added by 5? Or should 5 first be added with 6, then
divided by 2? Therefore we need parenthesis to show precedence:
‘5+(6/2)’ or ‘(5+6)/2’. Furthermore, if you need to leave a value for
later processing, you will need to define a variable for it; for
example, ‘a=(5+6)/2’.
Gnuastro provides libraries where you can also use infix notation in
C or C++ programs. However, Gnuastro's programs are primarily designed
to be run on the command-line and the level of complexity that infix
notation requires can be annoying/confusing to write on the command-line
(where they can get confused with the shell's parenthesis or variable
definitions). Therefore Gnuastro's Arithmetic and Table (when doing
column arithmetic) programs use the post-fix notation, also known as
reverse polish notation
(https://en.wikipedia.org/wiki/Reverse_Polish_notation). For example,
instead of writing ‘5+6’, we write ‘5 6 +’.
The Wikipedia article on the reverse polish notation provides some
excellent explanation on this notation but here we will give a short
summary here for self-sufficiency. In short, in the reverse polish
notation, the operator is placed after the operands. As we will see
below this removes the need to define parenthesis and lets you use
previous values without needing to define a variable. In the future(1)
we do plan to also optionally allow infix notation when arithmetic
operations on datasets are desired, but due to time constraints on the
developers we cannot do it immediately.
To easily understand how the reverse polish notation works, you can
think of each operand (‘5’ and ‘6’ in the example above) as a node in a
"last-in-first-out" stack. One such stack in daily life is a stack of
dishes in the kitchen: you put a clean dish, on the top of a stack of
dishes when it is ready for later usage. Later, when you need a dish,
you pick the top one (hence the "last" dish placed "in" the stack is the
"first" dish that comes "out" when necessary).
Each operator will need a certain number of operands (in the example
above, the ‘+’ operator needs two operands: ‘5’ and ‘6’). In the
kitchen metaphor, an operator can be an oven. Every time an operator is
confronted, the operator takes (or "pops") the number of operands it
needs from the top of the stack (so they do not exist in the stack any
more), does its operation, and places (or "pushes") the result back on
top of the stack. So if you want the average of 5 and 6, you would
write: ‘5 6 + 2 /’. The operations that are done are:
1. ‘5’ is an operand, so Arithmetic pushes it to the top of the stack
(which is initially empty). In the kitchen metaphor, you can
visualize this as taking a new dish from the cabinet, putting the
number 5 inside of the dish, and putting the dish on top of the
(empty) cooking table in front of you. You now have a stack of one
dish on the table in front of you.
2. ‘6’ is also an operand, so it is pushed to the top of the stack.
Like before, you can visualize this as taking a new dish from the
cabinet, putting the number 6 in it and placing it on top of the
previous dish. You now have a stack of two dishes on the table in
front of you.
3. ‘+’ is a _binary_ operator, so it will pop the top two elements of
the stack out of it, and perform addition on them (the order is
$5+6$ in the example above). The result is ‘11’ which is pushed to
the top of the stack.
To visualize this, you can think of the ‘+’ operator as an oven
with a place for two dishes. You pick up the top-most dish (that
has the number 6 in it) and put it in the oven. The top dish is
now the one that has the number 5. You also pick it up and put it
in the oven, and close the oven door. When the oven has finished
its cooking, it produces a single output (in one dish, with the
number 11 inside of it). You take that output dish and put it back
on the table. You now have a stack of one dish on the table in
front of you.
4. ‘2’ is an operand so push it onto the top of the stack. In the
kitchen metaphor, you again go to the cabinet, pick up a dish and
put the number 2 inside of it and put the dish over the previous
dish (that has the number 11). You now have a stack of two dishes
on the table in front of you.
5. ‘/’ (division) is a binary operator, so pull out the top two
elements of the stack (top-most is ‘2’, then ‘11’) and divide the
second one by the first. In the kitchen metaphor, the ‘/’ operator
can be visualized as a microwave that takes two dishes. But unlike
the oven (‘+’ operator) before, the order of inputs matters (they
are on top of each other: with the top dish holder being the
numerator and the bottom one being the denominator). Again, you
look at your stack of dishes on the table.
You pick up the top one (with value 2 inside of it) and put it in
the microwave's bottom (denominator) dish holder. Then you go back
to your stack of dishes on the table and pick up the top dish (with
value 11 inside of it) and put that in the top (nominator) dish
holder. The microwave will do its work and when it is finished,
returns a new dish with the single value 5.5 inside of it. You
pick up the dish from the microwave and place it back on the table.
6. There are no more operands or operators, so simply return the
remaining operand in the output. In the kitchen metaphor, you see
that your recipe has no more steps, so you just pick up the
remaining dish and take it to the dining room to enjoy a good
dinner.
In the Arithmetic program, the operands can be FITS images of any
dimensionality, or numbers (see *note Invoking astarithmetic::). In
Table's column arithmetic, they can be any column in the table (a series
of numbers in an array) or a single number (see *note Column
arithmetic::).
With this notation, very complicated procedures can be created
without the need for parenthesis or worrying about precedence. Even
functions which take an arbitrary number of arguments can be defined in
this notation. This is a very powerful notation and is used in
languages like Postscript (2) which produces PDF files when compiled.
---------- Footnotes ----------
(1) <https://savannah.gnu.org/task/index.php?13867>
(2) See the EPS and PDF part of *note Recognized file formats:: for a
little more on the Postscript language.
File: gnuastro.info, Node: Integer benefits and pitfalls, Next: Noise basics, Prev: Reverse polish notation, Up: Arithmetic
6.2.2 Integer benefits and pitfalls
-----------------------------------
Integers are the simplest numerical data types (*note Numeric data
types::). Because of this, their storage space is much less, and their
processing is much faster than floating point types. You can confirm
this on your computer with the series of commands below. You will make
four 5000 by 5000 pixel images filled with random values. Two of them
will be saved as signed 8-bit integers, and two with 64-bit floating
point types. The last command prints the size of the created images.
$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma int8 -oint-1.fits
$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma int8 -oint-2.fits
$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma float64 -oflt-1.fits
$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma float64 -oflt-2.fits
$ ls -lh int-*.fits flt-*.fits
The 8-bit integer images are only 24MB, while the 64-bit floating
point images are 191 MB! Besides helping in storage (on your disk, or in
RAM, while the program is running), the small size of these files also
helps in faster reading of the inputs. Furthermore, CPUs can process
integer operations much faster than floating points. In the integers,
the ones with a smaller width (number of bits) can be processed much
faster. You can see this with the two commands below where you will add
the integer images with each other and the floats with each other:
$ astarithmetic flt-1.fits flt-2.fits + -oflt-sum.fits -g1
$ astarithmetic int-1.fits int-2.fits + -oint-sum.fits -g1
Have a look at the running time of the two commands above (that is
printed on their last line). On the system that this paragraph was
written on, the floating point and integer image sums were respectively
done in 0.481 and 0.089 seconds (the integer operation was almost 5
times faster!).
*If your data does not have decimal points, use integer types:* integer
types are much faster and can take much less space in your storage or
RAM (while the program is running).
*Select the smallest width that can host the range/precision of values:*
for example, if the largest possible value in your dataset is 1000 and
all numbers are integers, store it as a 16-bit integer. Also, if you
know the values can never become negative, store it as an unsigned
16-bit integer. For floating point types, if you know you will not need
a precision of more than 6 significant digits, use the 32-bit floating
point type. For more on the range (for integers) and precision (for
floats), see *note Numeric data types::.
There is a price to be paid for this improved efficiency in integers:
your wisdom! If you have not selected your types wisely, strange
situations may happen. For example, try the command below:
$ astarithmetic 125 10 +
You expect the output to be $135$, but it will be $-121$! The reason is
that when Arithmetic (or column-arithmetic in Table) confronts a number
on the command-line, it use the principles above to select the most
efficient type for each number. Both $125$ and $10$ can safely fit
within a signed, 8-bit integer type, so arithmetic will store both as an
8-bit integer. However, the sum ($135$) is larger than the maximum
possible value of an 8-bit signed integer ($127$). Therefore an integer
overflow will occur, and the bits will be over-written. As a result,
the value will be $135-128=7$ more than the minimum value of this type
($-128$), which is $-128+7=-121$.
When you know situations like this may occur, you can simply use
*note Numerical type conversion operators::, to set just one of the
inputs to a wider data type (the smallest, wider type to avoid wasting
resources). In the example above, this would be ‘uint16’:
$ astarithmetic 125 uint16 10 +
The reason this worked is that $125$ is now converted into an
unsigned 16-bit integer before the ‘+’ operator. Since this is larger
than an 8-bit integer, the C programming language's automatic type
conversion will treat both as the wider type and store the result of the
binary operation (‘+’) in that type.
For such a basic operation like the command above, a faster hack
would be any of the two commands below (which are equivalent). This is
because ‘125.0’ or ‘125.’ are interpreted as floating-point types and
they do not suffer from such issues (converting only on one input is
enough):
$ astarithmetic 125. 10 +
$ astarithmetic 125.0 10 +
For this particular command, the fix above will be as fast as the
‘uint16’ solution. This is because there are only two numbers, and the
overhead of Arithmetic (reading configuration files, etc.) dominates
the running time. However, for large datasets, the ‘uint16’ solution
will be faster (as you saw above), Arithmetic will consume less RAM
while running, and the output will consume less storage in your system
(all major benefits)!
It is possible to do internal checks in Gnuastro and catch integer
overflows and correct them internally. However, we have not opted for
this solution because all those checks will consume significant
resources and slow down the program (especially with large datasets
where RAM, storage and running time become important). To be optimal,
we therefore trust that you (the wise Gnuastro user!) make the
appropriate type conversion in your commands where necessary (recall
that the operators are available in *note Numerical type conversion
operators::).
File: gnuastro.info, Node: Noise basics, Next: Arithmetic operators, Prev: Integer benefits and pitfalls, Up: Arithmetic
6.2.3 Noise basics
------------------
Deep astronomical images, like those used in extragalactic studies,
seriously suffer from noise in the data. Generally speaking, the
sources of noise in an astronomical image are photon counting noise and
Instrumental noise which are discussed in *note Photon counting noise::
and *note Instrumental noise::. This review finishes with *note
Generating random numbers:: which is a short introduction on how random
numbers are generated. We will see that while software random number
generators are not perfect, they allow us to obtain a reproducible
series of random numbers through setting the random number generator
function and seed value. Therefore in this section, we will also
discuss how you can set these two parameters in Gnuastro's programs
(including the arithmetic operators in *note Random number
generators::).
* Menu:
* Photon counting noise:: Poisson noise
* Instrumental noise:: Readout, dark current and other sources.
* Final noised pixel value:: How the final noised value is calculated.
* Generating random numbers:: How random numbers are generated.
File: gnuastro.info, Node: Photon counting noise, Next: Instrumental noise, Prev: Noise basics, Up: Noise basics
6.2.3.1 Photon counting noise
.............................
With the very accurate electronics used in today's detectors, photon
counting noise(1) is the most significant source of uncertainty in most
datasets. To understand this noise (error in counting) and its effect
on the images of astronomical targets, let's start by reviewing how a
distribution produced by counting can be modeled as a parametric
function.
Counting is an inherently discrete operation, which can only produce
positive integer outputs (including zero). For example, we cannot count
$3.2$ or $-2$ of anything. We only count $0$, $1$, $2$, $3$ and so on.
The distribution of values, as a result of counting efforts is formally
known as the Poisson distribution
(https://en.wikipedia.org/wiki/Poisson_distribution). It is associated
to Siméon Denis Poisson, because he discussed it while working on the
number of wrongful convictions in court cases in his 1837 book(2).
Let's take $\lambda$ to represent the expected mean count of
something. Furthermore, let's take $k$ to represent the output of a
counting attempt (hence $k$ is a positive integer). The probability
density function of getting $k$ counts (in each attempt, given the
expected/mean count of $\lambda$) can be written as:
$$f(k)={\lambda^k \over k!} e^{-\lambda},\quad k\in {0, 1, 2, 3, \dots }$$
Because the Poisson distribution is only applicable to positive
integer values (note the factorial operator, which only applies to
non-negative integers), naturally it is very skewed when $\lambda$ is
near zero. One qualitative way to understand this behavior is that for
smaller values near zero, there simply are not enough integers smaller
than the mean, than integers that are larger. Therefore to accommodate
all possibilities/counts, it has to be strongly skewed to the positive
when the mean is small. For more on Skewness, see *note Skewness caused
by signal and its measurement::.
As $\lambda$ becomes larger, the distribution becomes more and more
symmetric, and the variance of that distribution is equal to its mean.
In other words, the standard deviation is the square root of the mean.
It can also be proved that when the mean is large, say $\lambda>1000$,
the Poisson distribution approaches the Normal (Gaussian) distribution
(https://en.wikipedia.org/wiki/Normal_distribution) with mean
$\mu=\lambda$ and standard deviation $\sigma=\sqrt{\lambda}$. In other
words, a Poisson distribution (with a sufficiently large $\lambda$) is
simply a Gaussian that has one free parameter ($\mu=\lambda$ and
$\sigma=\sqrt{\lambda}$), instead of the two parameters that the
Gaussian distribution originally has (independent $\mu$ and $\sigma$).
In real situations, the photons/flux from our targets are combined
with photons from a certain background (observationally, the _Sky_
value). The Sky value is defined to be the average flux of a region in
the dataset with no targets. Its physical origin can be the brightness
of the atmosphere (for ground-based instruments), possible stray light
within the imaging instrument, the average flux of undetected targets,
etc. The Sky value is thus an ideal definition, because in real
datasets, what lies deep in the noise (far lower than the detection
limit) is never known(3). To account for all of these, the sky value is
defined to be the average count/value of the undetected regions in the
image. In a mock image/dataset, we have the luxury of setting the
background (Sky) value.
In summary, the value in each element of the dataset (pixel in an
image) is the sum of contributions from various galaxies and stars
(after convolution by the PSF, see *note PSF::). Let's name the
convolved sum of possibly overlapping objects in each pixel as $I_{nn}$.
$nn$ represents 'no noise'. For now, let's assume the background ($B$)
is constant and sufficiently high for the Poisson distribution to be
approximated by a Gaussian. Then the flux of that pixel, after adding
noise, is _a random value_ taken from a Gaussian distribution with the
following mean ($\mu$) and standard deviation ($\sigma$):
$$\mu=B+I_{nn}, \quad \sigma=\sqrt{B+I_{nn}}$$
In astronomical instruments, $B$ is enhanced by adding a "bias" level
to each pixel before the shutter is even opened (for the exposure to
start). As the exposure is ongoing and photo-electrons are accumulating
from the astronomical objects, a "dark" current (due to thermal
radiation of the instrument) also builds up in the pixels. The "dark"
current will accumulate even when the shutter is closed, but the CCD
electronics are working (hence the name "dark"). This added dark level
further enhances the mean value in a real observation compared to the
raw background value (from the atmosphere for example).
Since this type of noise is inherent in the objects we study, it is
usually measured on the same scale as the astronomical objects, namely
the magnitude system, see *note Brightness flux magnitude::. It is then
internally converted to the flux scale for further processing.
The equations above clearly show the importance of the background
value and its effect on the final signal to noise ratio in each pixel of
a science image. It is therefore, one of the most important factors in
understanding the noise (and properly simulating observations where
necessary). An inappropriately bright background value can hide the
signal of the mock profile hide behind the noise. In other words, a
brighter background has larger standard deviation and vice versa. As a
result, the only necessary parameter to define photon-counting noise
over a mock image of simulated profiles is the background. For a
complete example, see *note Sufi simulates a detection::.
To better understand the correlation between the mean (or background)
value and the noise standard deviation, let's use an analogy. Consider
the profile of your galaxy to be analogous to the profile of a ship that
is sailing in the sea. The height of the ship would therefore be
analogous to the maximum flux difference between your galaxy's minimum
and maximum values. Furthermore, let's take the depth of the sea to
represent the background value: a deeper sea, corresponds to a brighter
background. In this analogy, the "noise" would be the height of the
waves that surround the ship: in deeper waters, the waves would also be
taller (the square root of the mean depth at the ship's position).
If the ship is in deep waters, the height of waves are greater than
when the ship is near to the beach (at lower depths). Therefore, when
the ship is in the middle of the sea, there are high waves that are
capable of hiding a significant part of the ship from our perspective.
This corresponds to a brighter background value in astronomical images:
the resulting noise from that brighter background can completely wash
out the signal from a fainter galaxy, star or solar system object.
---------- Footnotes ----------
(1) In practice, we are actually counting the electrons that are
produced by each photon, not the actual photons.
(2) [From Wikipedia] Poisson's result was also derived in a previous
study by Abraham de Moivre in 1711. Therefore some people suggest it
should rightly be called the de Moivre distribution.
(3) In a real image, a relatively large number of very faint objects
can be fully buried in the noise and never detected. These undetected
objects will bias the background measurement to slightly larger values.
Our best approximation is thus to simply assume they are uniform, and
consider their average effect. See Figure 1 (a.1 and a.2) and Section
2.2 in Akhlaghi and Ichikawa 2015 (https://arxiv.org/abs/1505.01664).
File: gnuastro.info, Node: Instrumental noise, Next: Final noised pixel value, Prev: Photon counting noise, Up: Noise basics
6.2.3.2 Instrumental noise
..........................
While taking images with a camera, a bias current is fed to the pixels,
the variation of the value of this bias current over the pixels, also
adds to the final image noise. Another source of noise is the readout
noise that is produced by the electronics in the detector.
Specifically, the parts that attempt to digitize the voltage produced by
the photo-electrons in the analog to digital converter. With the
current generation of instruments, this source of noise is not as
significant as the noise due to the background Sky discussed in *note
Photon counting noise::.
Let $C$ represent the combined standard deviation of all these
instrumental sources of noise. When only this source of noise is
present, the noised pixel value would be a random value chosen from a
Gaussian distribution with
$$\mu=I_{nn}, \quad \sigma=\sqrt{C^2+I_{nn}}$$
This type of noise is independent of the signal in the dataset, it is
only determined by the instrument. So the flux scale (and not magnitude
scale) is most commonly used for this type of noise. In practice, this
value is usually reported in analog-to-digital units or ADUs, not flux
or electron counts. The gain value of the device can be used to convert
between these two, see *note Brightness flux magnitude::.
File: gnuastro.info, Node: Final noised pixel value, Next: Generating random numbers, Prev: Instrumental noise, Up: Noise basics
6.2.3.3 Final noised pixel value
................................
Based on the discussions in *note Photon counting noise:: and *note
Instrumental noise::, depending on the values you specify for $B$ and
$C$ from the above, the final noised value for each pixel is a random
value chosen from a Gaussian distribution with
$$\mu=B+I_{nn}, \quad \sigma=\sqrt{C^2+B+I_{nn}}$$
File: gnuastro.info, Node: Generating random numbers, Prev: Final noised pixel value, Up: Noise basics
6.2.3.4 Generating random numbers
.................................
As discussed above, to generate noise we need to make random samples of
a particular distribution. So it is important to understand some
general concepts regarding the generation of random numbers. For a very
complete and nice introduction we strongly advise reading Donald Knuth's
"The art of computer programming", volume 2, chapter 3(1). Quoting from
the GNU Scientific Library manual, "If you do not own it, you should
stop reading right now, run to the nearest bookstore, and buy it"(2)!
Using only software, we can only produce what is called a
psuedo-random sequence of numbers. A true random number generator is a
hardware (let's assume we have made sure it has no systematic biases),
for example, throwing dice or flipping coins (which have remained from
the ancient times). More modern hardware methods use atmospheric noise,
thermal noise or other types of external electromagnetic or quantum
phenomena. All pseudo-random number generators (software) require a
seed to be the basis of the generation. The advantage of having a seed
is that if you specify the same seed for multiple runs, you will get an
identical sequence of random numbers which allows you to reproduce the
same final noised image.
The programs in GNU Astronomy Utilities (for example, MakeNoise or
MakeProfiles) use the GNU Scientific Library (GSL) to generate random
numbers. GSL allows the user to set the random number generator through
environment variables, see *note Installation directory:: for an
introduction to environment variables. In the chapter titled "Random
Number Generation" they have fully explained the various random number
generators that are available (there are a lot of them!). Through the
two environment variables ‘GSL_RNG_TYPE’ and ‘GSL_RNG_SEED’ you can
specify the generator and its seed respectively.
If you do not specify a value for ‘GSL_RNG_TYPE’, GSL will use its
default random number generator type. The default type is sufficient
for most general applications. If no value is given for the
‘GSL_RNG_SEED’ environment variable and you have asked Gnuastro to read
the seed from the environment (through the ‘--envseed’ option), then GSL
will use the default value of each generator to give identical outputs.
If you do not explicitly tell Gnuastro programs to read the seed value
from the environment variable, then they will use the system time
(accurate to within a microsecond) to generate (apparently random)
seeds. In this manner, every time you run the program, you will get a
different random number distribution.
There are two ways you can specify values for these environment
variables. You can call them on the same command-line for example:
$ GSL_RNG_TYPE="taus" GSL_RNG_SEED=345 astarithmetic input.fits \
mknoise-sigma \
--envseed
In this manner the values will only be used for this particular
execution of Arithmetic. However, it makes your code hard to read!
Alternatively, you can define them for the full period of your terminal
session or script, using the shell's ‘export’ command with the two
separate commands below (for a script remove the ‘$’ signs):
$ export GSL_RNG_TYPE="taus"
$ export GSL_RNG_SEED=345
The subsequent programs which use GSL's random number generators will
hence forth use these values in this session of the terminal you are
running or while executing this script. In case you want to set fixed
values for these parameters every time you use the GSL random number
generator, you can add these two lines to your ‘.bashrc’ startup
script(3), see *note Installation directory::.
*IMPORTANT NOTE:* If the two environment variables ‘GSL_RNG_TYPE’ and
‘GSL_RNG_SEED’ are defined, GSL will report them by default, even if you
do not use the ‘--envseed’ option. For example, see this call to
MakeProfiles:
$ export GSL_RNG_TYPE=taus
$ export GSL_RNG_SEED=345
$ astmkprof -s1 --kernel=gaussian,2,5
GSL_RNG_TYPE=taus
GSL_RNG_SEED=345
MakeProfiles V.VV started on DDD MMM DDD HH:MM:SS YYYY
- Building one gaussian kernel
- Random number generator (RNG) type: taus
- Basic RNG seed: 1618960836
---- ./kernel.fits created.
-- Output: ./kernel.fits
MakeProfiles finished in 0.068945 seconds
The first two output lines (showing the names and values of the GSL
environment variables) are printed by GSL before MakeProfiles actually
starts generating random numbers. Gnuastro's programs will report the
actual values they use independently (after the name of the program),
you should check them for the final values used, not GSL's printed
values. In the example above, did you notice how the random number
generator seed above is different between GSL and MakeProfiles?
However, if ‘--envseed’ was given, both printed seeds would be the same.
---------- Footnotes ----------
(1) Knuth, Donald. 1998. The art of computer programming.
Addison-Wesley. ISBN 0-201-89684-2
(2) For students, running to the library might be more affordable!
(3) Do not forget that if you are going to give your scripts (that
use the GSL random number generator) to others you have to make sure you
also tell them to set these environment variable separately. So for
scripts, it is best to keep all such variable definitions within the
script, even if they are within your ‘.bashrc’.
File: gnuastro.info, Node: Arithmetic operators, Next: Invoking astarithmetic, Prev: Noise basics, Up: Arithmetic
6.2.4 Arithmetic operators
--------------------------
In this section, list of recognized operators in Arithmetic (and the
Table program's *note Column arithmetic::) and discussed in detail with
examples. As mentioned before, to be able to easily do complex
operations on the command-line, the Reverse Polish Notation is used
(where you write '$4\quad5\quad+$' instead of '$4 + 5$'), if you are not
already familiar with it, before continuing, please see *note Reverse
polish notation::.
The operands to all operators can be a data array (for example, a
FITS image or data cube) or a number, the output will be an array or
number according to the inputs. For example, a number multiplied by an
array will produce an array. The numerical data type of the output of
each operator is described within it. Here are some generic tips and
tricks (relevant to all operators):
Multiple operators in one command
When you need to use arithmetic commands in several consecutive
operations, you can use one command instead of multiple commands
and perform all calculations in the same command. For example,
assume you want to apply a threshold of 10 on your image, and label
the connected groups of pixel above this threshold. You need two
operators for this: ‘gt’ (for "greater than", see *note Conditional
operators::) and ‘connected-components’ (see *note Mathematical
morphology operators::). The bad (non-optimized and slow) way of
doing this is to call Arithmetic two times:
$ astarithmetic image.fits 10 gt --output=thresh.fits
$ astarithmetic thresh.fits 2 connected-components \
--output=labeled.fits
$ rm thresh.fits
The good (optimal) way is to call them after each other (remember
*note Reverse polish notation::):
$ astarithmetic image.fits 10 gt 2 connected-components \
--output=labeled.fits
You can similarly add any number of operations that must be done
sequentially in a single command and benefit from the speed and
lack of intermediate files. When your commands become long, you
can use the ‘set-AAA’ operator to make it more readable, see *note
Operand storage in memory or a file::.
Blank pixels in Arithmetic
Blank pixels in the image (see *note Blank pixels::) will be stored
based on the data type. When the input is floating point type,
blank values are NaN. One aspect of NaN values is that by
definition they will fail on _any_ comparison. Also, any operator
that includes a NaN as a an operand will produce a NaN
(irrespective of its other operands). Hence both equal and
not-equal operators will fail when both their operands are NaN!
Therefore, the only way to guarantee selection of blank pixels is
through the ‘isblank’ operator explained above.
One way you can exploit this property of the NaN value to your
advantage is when you want a fully zero-valued image (even over the
blank pixels) based on an already existing image (with same size
and world coordinate system settings). The following command will
produce this for you:
$ astarithmetic input.fits nan eq --output=all-zeros.fits
Note that on the command-line you can write NaN in any case (for
example, ‘NaN’, or ‘NAN’ are also acceptable). Reading NaN as a
floating point number in Gnuastro is not case-sensitive.
* Menu:
* Basic mathematical operators:: For example, +, -, /, log, and pow.
* Trigonometric and hyperbolic operators:: sin, cos, atan, asinh, etc.
* Constants:: Physical and Mathematical constants.
* Coordinate conversion operators:: For example equatorial J2000 to Galactic.
* Unit conversion operators:: Various unit conversions necessary.
* Statistical operators:: Statistics of a single dataset (for example, mean).
* Stacking operators:: Coadding or combining multiple datasets into one.
* Filtering operators:: Smoothing a dataset through mixing pixel with neighbors.
* Pooling operators:: Reducing size through statistics of pixels in window.
* Interpolation operators:: Giving blank pixels a value.
* Dimensionality changing operators:: Collapse or expand a dataset.
* Conditional operators:: Select certain pixels within the dataset.
* Mathematical morphology operators:: Work on binary images, for example, erode.
* Bitwise operators:: Work on bits within one pixel.
* Numerical type conversion operators:: Convert the numeric datatype of a dataset.
* Random number generators:: Random numbers can be used to add noise for example.
* Coordinate and border operators:: Return edges of 2D boxes.
* Loading external columns:: Read a column from a table into the stack.
* Size and position operators:: Extracting image size and pixel positions.
* Building new dataset and stack management:: How to construct an empty dataset from scratch.
* Operand storage in memory or a file:: Tools for complex operations in one command.
File: gnuastro.info, Node: Basic mathematical operators, Next: Trigonometric and hyperbolic operators, Prev: Arithmetic operators, Up: Arithmetic operators
6.2.4.1 Basic mathematical operators
....................................
These are some of the most common operations you will be doing on your
data and include, so no further explanation is necessary. If you are
new to Gnuastro, just read the description of each carefully.
‘+’
Addition, so "‘4 5 +’" is equivalent to $4+5$. For example, in the
command below, the value 20000 is added to each pixel's value in
‘image.fits’:
$ astarithmetic 20000 image.fits +
You can also use this operator to sum the values of one pixel in
two images (which have to be the same size). For example, in the
commands below (which are identical, see paragraph after the
commands), each pixel of ‘sum.fits’ is the sum of the same pixel's
values in ‘a.fits’ and ‘b.fits’.
$ astarithmetic a.fits b.fits + -h1 -h1 --output=sum.fits
$ astarithmetic a.fits b.fits + -g1 --output=sum.fits
The HDU/extension has to be specified for each image with ‘-h’.
However, if the HDUs are the same in all inputs, you can use ‘-g’
to only specify the HDU once
If you need to add more than one dataset, one way is to use this
operator multiple times, for example, see the two commands below
that are identical in the Reverse Polish Notation (*note Reverse
polish notation::):
$ astarithmetic a.fits b.fits + c.fits + -osum.fits
$ astarithmetic a.fits b.fits c.fits + + -osum.fits
However, this can get annoying/buggy if you have more than three or
four images, in that case, a better way to sum data is to use the
‘sum’ operator (which also ignores blank pixels), that is discussed
in *note Stacking operators::.
*NaN values:* if a single argument of ‘+’ has a NaN value, the
output will also be NaN. To ignore NaN values, use the ‘sum’
operator of *note Stacking operators::. You can see the difference
with the two commands below:
$ astarithmetic --quiet 1.0 2.0 3.0 nan + + +
nan
$ astarithmetic --quiet 1.0 2.0 3.0 nan 4 sum
6.000000e+00
The same goes for all the *note Stacking operators:: so if your
data may include NaN pixels, be sure to use the stacking operators.
‘-’
Subtraction, so "‘4 5 -’" is equivalent to $4-5$. Usage of this
operator is similar to ‘+’ operator, for example:
$ astarithmetic 20000 image.fits -
$ astarithmetic a.fits b.fits - -g1 --output=sub.fits
‘x’
Multiplication, so "‘4 5 x’" is equivalent to $4\times5$. For
example, in the command below, the value of each output pixel is 5
times its value in ‘image.fits’:
$ astarithmetic image.fits 5 x
And you can multiply the value of each pixel in two images, like
this:
$ astarithmetic a.fits a.fits x -g1 –output=multip.fits
‘/’
Division, so "‘4 5 /’" is equivalent to $4/5$. Like the
multiplication, for example
$ astarithmetic image.fits 5 -h1 /
$ astarithmetic a.fits b.fits / -g1 –output=div.fits
‘%’
Modulo (remainder), so "‘3 2 %’" will return $1$. Note that the
modulo operator only works on integer types (see *note Numeric data
types::). This operator is therefore not defined for most
processed astronomical astronomical images that have floating-point
value. However it is useful in labeled images, for example, *note
Segment output::). In such cases, each pixel is the integer label
of the object it is associated with hence with the example command
below, we can change the labels to only be between 1 and 4 and
decrease all objects on the image to 4/5th (all objects with a
label that is a multiple of 5 will be set to 0).
$ astarithmetic label.fits 5 1 %
‘abs’
Absolute value of first operand, so "‘4 abs’" is equivalent to
$|4|$. For example, the output of the command bellow will not have
any negative pixels (all negative pixels will be multiplied by $-1$
to become positive)
$ astarithmetic image.fits abs
‘pow’
First operand to the power of the second, so "‘4.3 5 pow’" is
equivalent to $4.3^{5}$. For example, with the command below all
pixels will be squared
$ astarithmetic image.fits 2 pow
‘sqrt’
The square root of the first operand, so "‘5 sqrt’" is equivalent
to $\sqrt{5}$. Since the square root is only defined for positive
values, any negative-valued pixel will become NaN (blank). The
output will have a floating point type, but its precision is
determined from the input: if the input is a 64-bit floating point,
the output will also be 64-bit. Otherwise, the output will be
32-bit floating point (see *note Numeric data types:: for the
respective precision). Therefore if you require 64-bit precision
in estimating the square root, convert the input to 64-bit floating
point first, for example, with ‘5 float64 sqrt’. For example, each
pixel of the output of the command below will be the square root of
that pixel in the input.
$ astarithmetic image.fits sqrt
If you just want to scale an image with negative values using this
operator (for better visual inspection, and the actual values do
not matter for you), you can subtract the image from its minimum
value, then take its square root:
$ astarithmetic image.fits image.fits minvalue - sqrt -g1
Alternatively, to avoid reading the image into memory two times,
you can use the ‘set-’ operator to read it into the variable ‘i’
and use ‘i’ two times to speed up the operation (described below):
$ astarithmetic image.fits set-i i i minvalue - sqrt
‘log’
Natural logarithm of first operand, so "‘4 log’" is equivalent to
$ln(4)$. Negative pixels will become NaN, and the output type is
determined from the input, see the explanation under ‘sqrt’ for
more on these features. For example, the command below will take
the natural logarithm of every pixel in the input.
$ astarithmetic image.fits log --output=log.fits
‘log10’
Base-10 logarithm of first popped operand, so "‘4 log’" is
equivalent to $log_{10}(4)$. Negative pixels will become NaN, and
the output type is determined from the input, see the explanation
under ‘sqrt’ for more on these features. For example, the command
below will take the base-10 logarithm of every pixel in the input.
$ astarithmetic image.fits log10
File: gnuastro.info, Node: Trigonometric and hyperbolic operators, Next: Constants, Prev: Basic mathematical operators, Up: Arithmetic operators
6.2.4.2 Trigonometric and hyperbolic operators
..............................................
All the trigonometric and hyperbolic functions are described here. One
good thing with these operators is that they take inputs and outputs in
degrees (which we usually need as input or output), not radians (like
most other programs/libraries).
‘sin’
‘cos’
‘tan’
Basic trigonometric functions. They take one operand, in units of
degrees.
‘asin’
‘acos’
‘atan’
Inverse trigonometric functions. They take one operand and the
returned values are in units of degrees.
‘atan2’
Inverse tangent (output in units of degrees) that uses the signs of
the input coordinates to distinguish between the quadrants. This
operator therefore needs two operands: the first popped operand is
assumed to be the X axis position of the point, and the second
popped operand is its Y axis coordinate.
For example, see the commands below. To be more clear, we are
using Table's *note Column arithmetic:: which uses exactly the same
internal library function as the Arithmetic program for images. We
are showing the results for four points in the four quadrants of
the 2D space (if you want to try running them, you do not need to
type/copy the parts after <#>). The first point (2,2) is in the
first quadrant, therefore the returned angle is 45 degrees. But
the second, third and fourth points are in the quadrants of the
same order, and the returned angles reflect the quadrant.
$ echo " 2 2" | asttable -c'arith $2 $1 atan2' # --> 45
$ echo " 2 -2" | asttable -c'arith $2 $1 atan2' # --> -45
$ echo "-2 -2" | asttable -c'arith $2 $1 atan2' # --> -135
$ echo "-2 2" | asttable -c'arith $2 $1 atan2' # --> 135
However, if you simply use the classic arc-tangent operator
(‘atan’) for the same points, the result will only be in two
quadrants as you see below:
$ echo " 2 2" | asttable -c'arith $2 $1 / atan' # --> 45
$ echo " 2 -2" | asttable -c'arith $2 $1 / atan' # --> -45
$ echo "-2 -2" | asttable -c'arith $2 $1 / atan' # --> 45
$ echo "-2 2" | asttable -c'arith $2 $1 / atan' # --> -45
‘sinh’
‘cosh’
‘tanh’
Hyperbolic sine, cosine, and tangent. These operators take a
single operand.
‘asinh’
‘acosh’
‘atanh’
Inverse Hyperbolic sine, cosine, and tangent. These operators take
a single operand.
File: gnuastro.info, Node: Constants, Next: Coordinate conversion operators, Prev: Trigonometric and hyperbolic operators, Up: Arithmetic operators
6.2.4.3 Constants
.................
During your analysis it is often necessary to have certain constants
like the number $\pi$. The "operators" in this section do not actually
take any operand, they just replace the desired constant into the stack.
So in effect, these are actually operands. But since their value is not
inserted by the user, we have placed them in the list of operators.
‘e’
Euler’s number, or the base of the natural logarithm (no units).
See Wikipedia
(https://en.wikipedia.org/wiki/E_(mathematical_constant)).
‘pi’
Ratio of circle’s circumference to its diameter (no units). See
Wikipedia (https://en.wikipedia.org/wiki/Pi).
‘c’
The speed of light in vacuum, in units of $m/s$. see Wikipedia
(https://en.wikipedia.org/wiki/Speed_of_light).
‘G’
The gravitational constant, in units of $m^3/kg/s^2$. See
Wikipedia (https://en.wikipedia.org/wiki/Gravitational_constant).
‘h’
Plank's constant, in units of $J/Hz$ or $kg\times m^2/s$. See
Wikipedia (https://en.wikipedia.org/wiki/Planck_constant).
‘au’
Astronomical Unit, in units of meters. See Wikipedia
(https://en.wikipedia.org/wiki/Astronomical_unit).
‘ly’
Distance covered by light in vacuum in one year, in units of
meters. See Wikipedia (https://en.wikipedia.org/wiki/Light-year).
‘avogadro’
Avogadro's constant, in units of $1/mol$. See Wikipedia
(https://en.wikipedia.org/wiki/Avogadro_constant).
‘fine-structure’
The fine-structure constant (no units). See Wikipedia
(https://en.wikipedia.org/wiki/Fine-structure_constant).
File: gnuastro.info, Node: Coordinate conversion operators, Next: Unit conversion operators, Prev: Constants, Up: Arithmetic operators
6.2.4.4 Coordinate conversion operators
.......................................
Different celestial coordinate systems are useful for different
scenarios. For example, assume you have the RA and Dec of large sample
of galaxies that you plan to study the halos of galaxies from. For such
studies, you prefer to stay as far away as possible from the Galactic
plane, because the density of stars and interstellar filaments (cirrus)
significantly increases as you get close to the Milky way's disk. But
the Equatorial coordinate system
(https://en.wikipedia.org/wiki/Equatorial_coordinate_system) which
defines the RA and Dec and is based on Earth's equator; and does not
show the position of your objects in relation to the galactic disk.
The best way forward in the example above is to convert your RA and
Dec table into the Galactic coordinate system
(https://en.wikipedia.org/wiki/Galactic_coordinate_system); and select
those with a large (positive or negative) Galactic latitude.
Alternatively, if you observe a bright point on a galaxy and want to
confirm if it was actually a super-nova and not a moving asteroid, a
first step is to convert your RA and Dec to the Ecliptic coordinate
system (https://en.wikipedia.org/wiki/Ecliptic_coordinate_system) and
confirm if you are sufficiently distant from the ecliptic (plane of the
Solar System; where fast moving objects are most common).
The operators described in this section are precisely for the purpose
above: to convert various celestial coordinate systems that are
supported within Gnuastro into each other. For example, if you want to
convert the RA and Dec equatorial (at the Julian year 2000 equinox)
coordinates (within the ‘RA’ and ‘DEC’ columns) of ‘points.fits’ into
Galactic longitude and latitude, you can use the command below (the
column metadata are not mandatory, but to avoid later confusion, it is
always good to have them in your output.
$ asttable points.fits -c'arith RA DEC eq-j2000-to-galactic' \
--colmetadata=1,GLON,deg,"Galactic longitude" \
--colmetadata=2,GLAT,deg,"Galactic latitude" \
--output=points-gal.fits
One important thing to consider is that the equatorial and ecliptic
coordinates are not static: they include the dynamics of Earth in the
solar system: in particular, the reference point on the equator moves
over decades. Therefore these two (equatorial and ecliptic) coordinate
systems are defined within epochs: the 1950 epoch is defined by
Besselian years
(https://en.wikipedia.org/wiki/Epoch_(astronomy)#Besselian_years), while
the 2000 epoch is defined in Julian years
(https://en.wikipedia.org/wiki/Epoch_(astronomy)#Julian_years_and_J2000).
So when dealing with these coordinates one of the '‘-b1950’' or
'‘-j2000’' suffixes are necessary (for example ‘eq-j2000’ or
‘ec-b1950’).
The Galactic or Supergalactic coordinates are not defined based on
the Earth's dynamics; therefore they do not have any epoch associated
with them. Extra-galactic studies do not depend on the dynamics of the
earth, but the equatorial coordinate system is the most dominant in that
field. Therefore in its 23rd General Assembly, the International
Astronomical Union approved the International Celestial Reference System
(https://en.wikipedia.org/wiki/International_Celestial_Reference_System_and_its_realizations)
or ICRS based on quasars (which are static within our observational
limitations)viewed through long baseline radio interferometry (the most
accurate method of observation that we currently have). ICRS is
designed to be within the errors of the Equatorial J2000 coordinate
system, so they are currently very similar; but ICRS has much better
accuracy. We will be adding ICRS in the operators below soon.
*Floating point errors:* The operation to convert between the
coordinate systems involves many sines, cosines (and their inverse).
Therefore, floating point errors (due to the limited precision of the
definition of floating points in bits) can cause small offsets. For
example see the code below were we convert equatorial to galactic and
back, then compare the input and output (which is in the 5th and 6th
decimal of a degree; or about 0.2 or 0.01 arcseconds).
$ sys1=eq-j2000
$ sys2=galactic
$ echo "10.2345689 45.6789012" \
| asttable -Afixed -B8 \
-c'arith $1 $2 '$sys1'-to-'$sys2' \
'$sys2'-to-'$sys1' set-lat set-lng \
lng $1 - lat $2 -'
0.00000363 -0.00007725
If you set ‘sys2=ec-j2000’ or ‘sys2=supergalactic’, it will be zero
until the full set of 8 decimals that are printed here (the displayed
precision can be changed with the value of the ‘-B’ option above). It
is therefore useful to have your original coordinates (in the same table
for example) and not do too many conversions on conversions (to
propagate this problem).
‘eq-b1950-to-eq-j2000’
‘eq-b1950-to-ec-b1950’
‘eq-b1950-to-ec-j2000’
‘eq-b1950-to-galactic’
‘eq-b1950-to-supergalactic’
Convert Equatorial (B1950 equinox) coordinates into the respective
coordinate system within each operator's name.
‘eq-j2000-to-eq-b1950’
‘eq-j2000-to-ec-b1950’
‘eq-j2000-to-ec-j2000’
‘eq-j2000-to-galactic’
‘eq-j2000-to-supergalactic’
Convert Equatorial (J2000 equinox) coordinates into the respective
coordinate system within each operator's name.
‘ec-b1950-to-eq-b1950’
‘ec-b1950-to-eq-j2000’
‘ec-b1950-to-ec-j2000’
‘ec-b1950-to-galactic’
‘ec-b1950-to-supergalactic’
Convert Ecliptic (B1950 equinox) coordinates into the respective
coordinate system within each operator's name.
‘ec-j2000-to-eq-b1950’
‘ec-j2000-to-eq-j2000’
‘ec-j2000-to-ec-b1950’
‘ec-j2000-to-galactic’
‘ec-j2000-to-supergalactic’
Convert Ecliptic (J2000 equinox) coordinates into the respective
coordinate system within each operator's name.
‘galactic-to-eq-b1950’
‘galactic-to-eq-j2000’
‘galactic-to-ec-b1950’
‘galactic-to-ec-j2000’
‘galactic-to-supergalactic’
Convert Galactic coordinates into the respective coordinate system
within each operator's name.
‘supergalactic-to-eq-b1950’
‘supergalactic-to-eq-j2000’
‘supergalactic-to-ec-b1950’
‘supergalactic-to-ec-j2000’
‘supergalactic-to-galactic’
Convert Supergalactic coordinates into the respective coordinate
system within each operator's name.
File: gnuastro.info, Node: Unit conversion operators, Next: Statistical operators, Prev: Coordinate conversion operators, Up: Arithmetic operators
6.2.4.5 Unit conversion operators
.................................
It often happens that you have data in one unit (for example, counts on
your CCD), but would like to convert it into another (for example,
magnitudes, to measure the brightness of a galaxy). While the equations
for the unit conversions can be easily found on the internet, the
operators in this section are designed to simplify the process and let
you do it easily and fast without having to remember constants and
relations.
‘counts-to-mag’
Convert counts (usually CCD outputs) to magnitudes using the given
zero point. The zero point is the first popped operand and the
count image or value is the second popped operand.
For example, assume you have measured the standard deviation of the
noise in an image to be ‘0.1’ counts, and the image's zero point is
‘22.5’ and you want to measure the _per-pixel_ surface brightness
limit of the dataset(1). To apply this operator on an image,
simply replace ‘0.1’ with the image name, as described below.
$ astarithmetic 0.1 22.5 counts-to-mag --quiet
Of course, you can also convert every pixel in an image (or table
column in Table's *note Column arithmetic::) with this operator if
you replace the second popped operand with an image/column name.
For an example of applying this operator on an image, see the
description of surface brightness in *note Brightness flux
magnitude::, where we will convert an image's pixel values to
surface brightness.
‘mag-to-counts’
Convert magnitudes to counts (usually CCD outputs) using the given
zero point. The zero point is the first popped operand and the
magnitude value is the second. For example, if an object has a
magnitude of 20, you can estimate the counts corresponding to it
(when the image has a zero point of 24.8) with this command: Note
that because the output is a single number, we are using ‘--quiet’
to avoid printing extra information.
$ astarithmetic 20 24.8 mag-to-counts --quiet
‘counts-to-sb’
Convert counts to surface brightness using the zero point and area
(in units of arcsec$^2$). The first popped operand is the area (in
arcsec$^2$), the second popped operand is the zero point and the
third are the count values. Estimating the surface brightness
involves taking the logarithm. Therefore this operator will
produce NaN for counts with a negative value.
For example, with the commands below, we read the zero point from
the image headers (assuming it is in the ‘ZPOINT’ keyword), we
calculate the pixel area from the image itself, and we call this
operator to convert the image pixels (in counts) to surface
brightness (mag/arcsec$^2$).
$ zeropoint=$(astfits image.fits --keyvalue=ZPOINT -q)
$ pixarea=$(astfits image.fits --pixelareaarcsec2)
$ astarithmetic image.fits $zeropoint $pixarea counts-to-sb \
--output=image-sb.fits
For more on the definition of surface brightness see *note
Brightness flux magnitude::, and for a fully tutorial on optimal
usage of this, see *note FITS images in a publication::.
‘sb-to-counts’
Convert surface brightness using the zero point and area (in units
of arcsec$^2$) to counts. The first popped operand is the area (in
arcsec$^2$), the second popped operand is the zero point and the
third are the surface brightness values. See the description of
‘counts-to-sb’ for more.
‘mag-to-sb’
Convert magnitudes to surface brightness over a certain area (in
units of arcsec$^2$). The first popped operand is the area and the
second is the magnitude. For example, let's assume you have a
table with the two columns of magnitude (called ‘MAG’) and area
(called ‘AREAARCSEC2’). In the command below, we will use *note
Column arithmetic:: to return the surface brightness.
$ asttable table.fits -c'arith MAG AREAARCSEC2 mag-to-sb'
‘sb-to-mag’
Convert surface brightness to magnitudes over a certain area (in
units of arcsec$^2$). The first popped operand is the area and the
second is the magnitude. See the description of ‘mag-to-sb’ for
more.
‘counts-to-jy’
Convert counts (usually CCD outputs) to Janskys through an
AB-magnitude based zero point. The top-popped operand is assumed
to be the AB-magnitude zero point and the second-popped operand is
assumed to be a dataset in units of counts (an image in Arithmetic,
and a column in Table's *note Column arithmetic::). For the full
equation and basic definitions, see *note Brightness flux
magnitude::.
For example, SDSS images are calibrated in units of nanomaggies,
with a fixed zero point magnitude of 22.5. Therefore you can
convert the units of SDSS image pixels to Janskys with the command
below:
$ astarithmetic sdss-image.fits 22.5 counts-to-jy
‘jy-to-counts’
Convert Janskys to counts (usually CCD outputs) through an
AB-magnitude based zero point. This is the inverse operation of
the ‘counts-to-jy’, see there for usage example.
‘counts-to-nanomaggy’
Convert counts to Nanomaggy (with fixed zero point of 22.5, used as
the pixel units of many surveys like SDSS). For example if your
image has a zero point of 24.93, you can convert it to Nanomaggies
with the command below:
$ astarithmetic image.fits 24.93 counts-to-nanomaggy
‘nanomaggy-to-counts’
Convert Nanomaggy to counts. Nanomaggy is defined to have a fixed
zero point of 22.5 and is the pixel units of many surveys like
SDSS. For example if you would like to convert an image in units of
Nanomaggy (for example from SDSS) to the counts of a camera with a
zero point of 25.92, you can use the command below:
$ astarithmetic image.fits 25.92 nanomaggy-to-counts
‘mag-to-jy’
Convert AB magnitudes to Janskys, see *note Brightness flux
magnitude::.
‘jy-to-mag’
Convert Janskys to AB magnitude, see *note Brightness flux
magnitude::.
‘au-to-pc’
Convert Astronomical Units (AUs) to Parsecs (PCs). This operator
takes a single argument which is interpreted to be the input AUs.
The conversion is based on the definition of Parsecs: $1 \rm{PC} =
1/tan(1^{\prime\prime}) \rm{AU}$, where $1^{\prime\prime}$ is one
arcseconds. In other words, $1 (\rm{PC}) = 648000/\pi (\rm{AU})$.
For example, if we take Pluto's average distance to the Sun to be
40 AUs, we can obtain its distance in Parsecs using this command:
echo 40 | asttable -c'arith $1 au-to-pc'
‘pc-to-au’
Convert Parsecs (PCs) to Astronomical Units (AUs). This operator
takes a single argument which is interpreted to be the input PCs.
For more on the conversion equation, see description of ‘au-to-pc’.
For example, Proxima Centauri (the nearest star to the Solar
system) is 1.3020 Parsecs from the Sun, we can calculate this
distance in units of AUs with the command below:
echo 1.3020 | asttable -c'arith $1 pc-to-au'
‘ly-to-pc’
Convert Light-years (LY) to Parsecs (PCs). This operator takes a
single argument which is interpreted to be the input LYs. The
conversion is done from IAU's definition of the light-year
(9460730472580800 m $\approx$ 63241.077 AU = 0.306601 PC, for the
conversion of AU to PC, see the description of ‘au-to-pc’).
For example, the distance of Andromeda galaxy to our galaxy is 2.5
million light-years, so its distance in kilo-Parsecs can be
calculated with the command below (note that we want the output in
kilo-parsecs, so we are dividing the output of this operator by
1000):
echo 2.5e6 | asttable -c'arith $1 ly-to-pc 1000 /'
‘pc-to-ly’
Convert Parsecs (PCs) to Light-years (LY). This operator takes a
single argument which is interpreted to be the input PCs. For the
conversion and an example of the inverse of this operator, see the
description of ‘ly-to-pc’.
‘ly-to-au’
Convert Light-years (LY) to Astronomical Units (AUs). This
operator takes a single argument which is interpreted to be the
input LYs. For the conversion and a similar example, see the
description of ‘ly-to-pc’.
‘au-to-ly’
Convert Astronomical Units (AUs) to Light-years (LY). This operator
takes a single argument which is interpreted to be the input AUs.
For the conversion and a similar example, see the description of
‘ly-to-pc’.
---------- Footnotes ----------
(1) The _per-pixel_ surface brightness limit is the magnitude of the
noise standard deviation. For more on surface brightness see *note
Brightness flux magnitude::. In the example command, because the output
is a single number, we are using ‘--quiet’ to avoid printing extra
information.
File: gnuastro.info, Node: Statistical operators, Next: Stacking operators, Prev: Unit conversion operators, Up: Arithmetic operators
6.2.4.6 Statistical operators
.............................
The operators in this section take a single dataset as input, and will
return the desired statistic as a single value.
‘minvalue’
Minimum value in the first popped operand, so "‘a.fits minvalue’"
will push the minimum pixel value in this image onto the stack.
When this operator acts on a single image, the output (operand that
is put back on the stack) will no longer be an image, but a number.
The output of this operand is in the same type as the input. This
operator is mainly intended for multi-element datasets (for
example, images or data cubes), if the popped operand is a number,
it will just return it without any change.
Note that when the final remaining/output operand is a single
number, it is printed onto the standard output. For example, with
the command below the minimum pixel value in ‘image.fits’ will be
printed in the terminal:
$ astarithmetic image.fits minvalue
However, the output above also includes a lot of extra information
that are not relevant in this context. If you just want the final
number, run Arithmetic in quiet mode:
$ astarithmetic image.fits minvalue -q
Also see the description of ‘sqrt’ for other example usages of this
operator.
‘maxvalue’
Maximum value of first operand in the same type, similar to
‘minvalue’, see the description there for more. For example
$ astarithmetic image.fits maxvalue -q
‘numbervalue’
Number of non-blank elements in first operand in the ‘uint64’ type
(since it is always a positive integer, see *note Numeric data
types::). Its usage is similar to ‘minvalue’, for example
$ astarithmetic image.fits numbervalue -q
‘sumvalue’
Sum of non-blank elements in first operand in the ‘float32’ type.
Its usage is similar to ‘minvalue’, for example
$ astarithmetic image.fits sumvalue -q
‘meanvalue’
Mean value of non-blank elements in first operand in the ‘float32’
type. Its usage is similar to ‘minvalue’, for example
$ astarithmetic image.fits meanvalue -q
‘stdvalue’
Standard deviation of non-blank elements in first operand in the
‘float32’ type. Its usage is similar to ‘minvalue’, for example
$ astarithmetic image.fits stdvalue -q
‘medianvalue’
Median of non-blank elements in first operand with the same type.
Its usage is similar to ‘minvalue’, for example
$ astarithmetic image.fits medianvalue -q
‘unique’
Remove all duplicate (and blank) elements from the first popped
operand. The unique elements of the dataset will be stored in a
single-dimensional dataset.
Recall that by default, single-dimensional datasets are stored as a
table column in the output. But you can use ‘--onedasimage’ or
‘--onedonstdout’ to respectively store them as a single-dimensional
FITS array/image, or to print them on the standard output.
Although you can use this operator on the floating point dataset,
due to floating-point errors it may give non-reasonable values:
because the tenth digit of the decimal point is also considered
although it may be statistically meaningless, see *note Numeric
data types::. It is therefore better/recommended to use it on the
integer dataset like the labeled images of *note Segment output::
where each pixel has the integer label of the object/clump it is
associated with. For example, let's assume you have cropped a
region of a larger labeled image and want to find the
labels/objects that are within the crop. With this operator, this
job is trivial:
$ astarithmetic seg-crop.fits unique
‘noblank’
Remove all blank elements from the first popped operand. Since the
blank pixels are being removed, the output dataset will always be
single-dimensional, independent of the dimensionality of the input.
Recall that by default, single-dimensional datasets are stored as a
table column in the output. But you can use ‘--onedasimage’ or
‘--onedonstdout’ to respectively store them as a single-dimensional
FITS array/image, or to print them on the standard output.
For example, with the command below, the non-blank pixel values of
‘cropped.fits’ are printed on the command-line (the ‘--quiet’
option is used to remove the extra information that Arithmetic
prints as it reads the inputs, its version and its running time).
$ astarithmetic cropped.fits noblank --onedonstdout --quiet
File: gnuastro.info, Node: Stacking operators, Next: Filtering operators, Prev: Statistical operators, Up: Arithmetic operators
6.2.4.7 Stacking operators
..........................
The operators in this section are used when you have multiple datasets
that you would like to merge into one, commonly known as "stacking" or
"coaddition". For example, you have taken ten exposures of your
scientific target, and you would like to combine them all into one deep
stacked image that is deeper.
When calling these operators you should determine how many operands
they should take in (unlike the rest of the operators that have a fixed
number of input operands). As described in the first operand below, you
do this through their first popped operand (which should be a single
integer number that is larger than one). Below are Some important
points for all the stacking operators described in this section:
• NaN/blank pixels will be ignored, see *note Blank pixels::.
• The output will have the same type as the inputs. This is natural
for the ‘min’ and ‘max’ operators, but for other similar operators
(for example, ‘sum’, or ‘average’) the per-pixel operations will be
done in double precision floating point and then stored back in the
input type. Therefore, if the input was an integer, C's internal
type conversion will be used.
• The operation will be multi-threaded, greatly speeding up the
process if you have large and numerous data to stack. You can
disable multi-threaded operations with the ‘--numthreads=1’ option
(see *note Multi-threaded operations::).
‘min’
‘max’
‘sum’
‘std’
‘mad’
‘mean’
‘median’
‘number’
For each pixel, calculate the respective statistic from in all
given datasets. For the ‘min’ and ‘max’ operators, the output will
have the same type as the input. For the ‘number’ operator, the
output will have an unsigned 32-bit integer type and the rest will
be 32-bit floating point.
The first popped operand to this operator must be a positive
integer number which specifies how many further operands should be
popped from the stack. All the subsequently popped operands must
have the same type and size. This operator (and all the
variable-operand operators similar to it that are discussed below)
will work in multi-threaded mode unless Arithmetic is called with
the ‘--numthreads=1’ option, see *note Multi-threaded operations::.
For example, the following command will produce an image with the
same size and type as the three inputs, but each output pixel value
will be the minimum of the same pixel's values in all three input
images.
$ astarithmetic a.fits b.fits c.fits 3 min --output=min.fits
Regarding the ‘number’ operator: some datasets may have blank
values (which are also ignored in all similar operators like ‘min’,
‘sum’, ‘mean’ or ‘median’). Hence, the final pixel values of this
operator will not, in general, be equal to the number of inputs.
This operator is therefore mostly called in parallel with those
operators to know the "weight" of each pixel (in case you want to
only keep pixels that had the full exposure for example).
‘quantile’
For each pixel, find the quantile from all given datasets. The
output will have the same numeric data type and size as the input
datasets. Besides the input datasets, the quantile operator also
needs a single parameter (the requested quantile). The parameter
should be the first popped operand, with a value between (and
including) 0 and 1. The second popped operand must be the number
of datasets to use.
In the example below, the first-popped operand (‘0.7’) is the
quantile, the second-popped operand (‘3’) is the number of datasets
to pop.
astarithmetic a.fits b.fits c.fits 3 0.7 quantile
‘madclip-fill-mad’
‘madclip-fill-std’
‘madclip-fill-mean’
‘madclip-fill-median’
‘madclip-fill-number’
Find the respective statistic after median absolute deviation (MAD)
filled re-clipping of the values of the same element (pixel in an
image) of all the inputs. For a complete tutorial on clipping
outliers see *note Clipping outliers:: (if you haven't read it yet,
we encourage you to read through it before continuing). For the
particular case of filled re-clipping (the ‘madclip-fill-*’
operators here) see *note Filled re-clipping::. Spoiler alert:
this is currently the most robust stacking operator in Gnuastro.
The output data type of all these operators is a 32-bit floating
point number, except for ‘madclip-fill-number’, where an unsigned
32-bit integer is returned (see *note Numeric data types::). This
operator will combine the given inputs into a single output of the
same dimension as one of the inputs. Each pixel of the output
contains the requested statistic from the remaining values after
filled MAD re-clipping. This operator is very similar to ‘min’,
with the exception that it expects two extra operands (parameters
for MAD-clipping) before the total number of inputs. The first
popped operand is the termination criteria and the second is the
multiple of the median absolute deviation.
For example, in the command below, the first popped operand
(‘0.01’) is the MAD-clipping termination criteria. If the
termination criteria is larger than, or equal to, 1 it is
interpreted as a pre-defined the number of clips. But if it is
between 0 and 1, then it is the tolerance level on the change in
the median absolute deviation (see *note MAD clipping::). The
second popped operand (‘5’) is the multiple of the median absolute
deviation to use in MAD-clipping. The third popped operand (‘3’)
is number of datasets that will be used (similar to the first
popped operand to ‘min’).
$ astarithmetic a.fits b.fits c.fits 3 5 0.01 madclip-fill-std
‘madclip-mad’
‘madclip-std’
‘madclip-mean’
‘madclip-median’
‘madclip-number’
Find the number of useful values after median absolute deviation
(MAD) clipping of the values of the same element (pixel in an
image) of all the inputs. These operators are called in an
identical fashion to the ‘madclip-fill-*’ operators described
above; see the description there for more.
‘sigclip-fill-number’
Find the number of useful values after filled sigma ($\sigma$,
which stands for the standard deviation) re-clipping of the values
of the same element (pixel in an image) of all the inputs. For a
complete tutorial on clipping outliers see *note Clipping
outliers:: (if you haven't read it yet, we encourage you to read
through it before continuing). For the particular case of filled
re-clipping (the ‘sigclip-fill-*’ operators here) see *note Filled
re-clipping::. Spoiler alert: MAD filled re-clipping is currently
most robust stacking operator in Gnuastro (the ‘madclip-fill-*’
operators).
This operator will combine the specified number of inputs into a
single output that contains the number of remaining elements after
$\sigma$-clipping on each element/pixel (for more on
$\sigma$-clipping, see *note Sigma clipping::). This operator is
very similar to ‘min’, with the exception that it expects two
operands (parameters for $\sigma$-clipping) before the total number
of inputs. The first popped operand is the termination criteria
and the second is the multiple of $\sigma$.
For example, in the command below, the first popped operand (‘0.2’)
is the sigma clipping termination criteria. If the termination
criteria is larger than, or equal to, 1 it is interpreted as the
number of clips to do. But if it is between 0 and 1, then it is
the tolerance level on the standard deviation (see *note Sigma
clipping::). The second popped operand (‘5’) is the multiple of
sigma to use in $\sigma$-clipping. The third popped operand (‘10’)
is number of datasets that will be used (similar to the first
popped operand to ‘min’).
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-fill-number
‘sigclip-fill-median’
For each pixel, find the $\sigma$-clipped median in all given
datasets. The output will have the a single-precision (32-bit)
floating point type. This operator is called similar to the
‘sigclip-fill-number’ operator, please see there for more. For
example
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-fill-median
‘sigclip-fill-mean’
For each pixel, find the $\sigma$-clipped mean in all given
datasets. The output will have the a single-precision (32-bit)
floating point type. This operator is called similar to the
‘sigclip-fill-number’ operator, please see there for more. For
example
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-fill-mean
‘sigclip-fill-std’
For each pixel, find the $\sigma$-clipped standard deviation in all
given datasets. The output will have the a single-precision
(32-bit) floating point type. This operator is called similar to
the ‘sigclip-fill-number’ operator, please see there for more. For
example
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-fill-std
‘sigclip-fill-mad’
For each pixel, find the $\sigma$-clipped median absolute deviation
(MAD) in all given datasets. The output will have the a
single-precision (32-bit) floating point type. This operator is
called similar to the ‘sigclip-fill-number’ operator, please see
there for more. For example
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-fill-mad
‘sigclip-number’
Find the number of useful values after sigma ($\sigma$, which
stands for the standard deviation) clipping of the values of the
same element (pixel in an image) of all the inputs. For a complete
tutorial on clipping outliers see *note Clipping outliers:: (if you
haven't read it yet, we encourage you to read through it before
continuing). For the particular case of $\sigma$-clipping (the
‘sigclip-*’ operators here) see *note Sigma clipping::. Spoiler
alert: MAD filled re-clipping is currently most robust stacking
operator in Gnuastro (the ‘madclip-fill-*’ operators).
This operator will combine the specified number of inputs into a
single output that contains the number of remaining elements after
$\sigma$-clipping on each element/pixel (for more on
$\sigma$-clipping, see *note Sigma clipping::). This operator is
very similar to ‘min’, with the exception that it expects two
operands (parameters for $\sigma$-clipping) before the total number
of inputs. The first popped operand is the termination criteria
and the second is the multiple of $\sigma$.
For example, in the command below, the first popped operand (‘0.2’)
is the sigma clipping termination criteria. If the termination
criteria is larger than, or equal to, 1 it is interpreted as the
number of clips to do. But if it is between 0 and 1, then it is
the tolerance level on the standard deviation (see *note Sigma
clipping::). The second popped operand (‘5’) is the multiple of
sigma to use in $\sigma$-clipping. The third popped operand (‘10’)
is number of datasets that will be used (similar to the first
popped operand to ‘min’).
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-number
‘sigclip-median’
For each pixel, find the $\sigma$-clipped median in all given
datasets. The output will have the a single-precision (32-bit)
floating point type. This operator is called similar to the
‘sigclip-number’ operator, please see there for more. For example
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-median
‘sigclip-mean’
For each pixel, find the $\sigma$-clipped mean in all given
datasets. The output will have the a single-precision (32-bit)
floating point type. This operator is called similar to the
‘sigclip-number’ operator, please see there for more. For example
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-mean
‘sigclip-std’
For each pixel, find the $\sigma$-clipped standard deviation in all
given datasets. The output will have the a single-precision
(32-bit) floating point type. This operator is called similar to
the ‘sigclip-number’ operator, please see there for more. For
example
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-std
‘sigclip-mad’
For each pixel, find the $\sigma$-clipped median absolute deviation
(MAD) in all given datasets. The output will have the a
single-precision (32-bit) floating point type. This operator is
called similar to the ‘sigclip-number’ operator, please see there
for more. For example
astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-mad
File: gnuastro.info, Node: Filtering operators, Next: Pooling operators, Prev: Stacking operators, Up: Arithmetic operators
6.2.4.8 Filtering (smoothing) operators
.......................................
Image filtering is commonly used for smoothing: every pixel value in the
output image is created by applying a certain statistic to the pixels in
its vicinity.
‘filter-mean’
Apply mean filtering (or moving average
(https://en.wikipedia.org/wiki/Moving_average)) on the input
dataset. During mean filtering, each pixel (data element) is
replaced by the mean value of all its surrounding pixels (excluding
blank values). The number of surrounding pixels in each dimension
(to calculate the mean) is determined through the earlier operands
that have been pushed onto the stack prior to the input dataset.
The number of necessary operands is determined by the dimensions of
the input dataset (first popped operand). The order of the
dimensions on the command-line is the order in FITS format. Here
is one example:
$ astarithmetic 5 4 image.fits filter-mean
In this example, each pixel is replaced by the mean of a 5 by 4 box
around it. The box is 5 pixels along the first FITS dimension
(horizontal when viewed in ds9) and 4 pixels along the second FITS
dimension (vertical).
Each pixel will be placed in the center of the box that the mean is
calculated on. If the given width along a dimension is even, then
the center is assumed to be between the pixels (not in the center
of a pixel). When the pixel is close to the edge, the pixels of
the box that fall outside the image are ignored. Therefore, on the
edge, less points will be used in calculating the mean.
The final effect of mean filtering is to smooth the input image, it
is essentially a convolution with a kernel that has identical
values for all its pixels (is flat), see *note Convolution
process::.
Note that blank pixels will also be affected by this operator: if
there are any non-blank elements in the box surrounding a blank
pixel, in the filtered image, it will have the mean of the
non-blank elements, therefore it will not be blank any more. If
blank elements are important for your analysis, you can use the
‘isblank’ operator with the ‘where’ operator to set them back to
blank after filtering.
For example in the command below, we are first filtering the image,
then setting its original blank elements back to blank in the
output of filtering (all within one Arithmetic command). Note how
we are using the ‘set-’ operator to give names to the temporary
outputs of steps and simplify the code (see *note Operand storage
in memory or a file::).
$ astarithmetic image.fits -h1 set-in \
5 4 in filter-mean set-filtered \
filtered in isblank nan where \
--output=out.fits
‘filter-median’
Apply median filtering
(https://en.wikipedia.org/wiki/Median_filter) on the input dataset.
This is very similar to ‘filter-mean’, except that instead of the
mean value of the box pixels, the median value is used to replace a
pixel value. For more on how to use this operator, please see
‘filter-mean’.
The median is less susceptible to outliers compared to the mean.
As a result, after median filtering, the pixel values will be more
discontinuous than mean filtering.
‘filter-sigclip-mean’
Apply a $\sigma$-clipped mean filtering onto the input dataset.
This is very similar to ‘filter-mean’, except that all outliers
(identified by the $\sigma$-clipping algorithm) have been removed,
see *note Sigma clipping:: for more on the basics of this
algorithm. As described there, two extra input parameters are
necessary for $\sigma$-clipping: the multiple of $\sigma$ and the
termination criteria. ‘filter-sigclip-mean’ therefore needs to pop
two other operands from the stack after the dimensions of the box.
For example, the line below uses the same box size as the example
of ‘filter-mean’. However, all elements in the box that are
iteratively beyond $3\sigma$ of the distribution's median are
removed from the final calculation of the mean until the change in
$\sigma$ is less than $0.2$.
$ astarithmetic 3 0.2 5 4 image.fits filter-sigclip-mean
The median (which needs a sorted dataset) is necessary for
$\sigma$-clipping, therefore ‘filter-sigclip-mean’ can be
significantly slower than ‘filter-mean’. However, if there are
strong outliers in the dataset that you want to ignore (for
example, emission lines on a spectrum when finding the continuum),
this is a much better solution.
‘filter-sigclip-median’
Apply a $\sigma$-clipped median filtering onto the input dataset.
This operator and its necessary operands are almost identical to
‘filter-sigclip-mean’, except that after $\sigma$-clipping, the
median value (which is less affected by outliers than the mean) is
added back to the stack.
File: gnuastro.info, Node: Pooling operators, Next: Interpolation operators, Prev: Filtering operators, Up: Arithmetic operators
6.2.4.9 Pooling operators
.........................
Pooling is one way of reducing the complexity of the input image by
grouping multiple input pixels into one output pixel (using any
statistical measure). As a result, the output image has fewer pixels
(less complexity). In Computer Vision, Pooling is commonly used in
Convolutional Neural Networks
(https://en.wikipedia.org/wiki/Convolutional_neural_network) (CNNs).
In pooling, the inputs are an image (e.g., a FITS file) and a square
window pixel size that is known as a pooling window. The window has to
be smaller than the input's number of pixels in both dimensions and its
width is called the "pool size". The pooling window starts at the
top-left corner pixel of the input and calculates statistical operations
on the pixels that overlap with it. It slides forward by the "stride"
pixels, moving over all pixels in the input from the top-left corner to
the bottom-right corner, and repeats the same calculation for the
overlapping pixels in each position.
Usually, the stride (or spacing between the windows as they slide
over the input) is equal to the window-size. In other words, in
pooling, the separate "windows" do not overlap with each other on the
input. However, you can choose any size for the stride. Remember this,
It's crucial to ensure that the stride size is less than the pool size.
If not, some pixels may be missed during the pooling process. Therefore
there are two major differences with *note Spatial domain convolution::
or *note Filtering operators::, but pooling has some similarities to the
*note Warp::.
• In convolution or filtering the input and output sizes are the
same. However, when the stride is larger than 1 then, the output
of pooling must have fewer pixels.
• In convolution or filters, the kernels slide over the input in a
pixel-by-pixel manner. As a result, the same pixel's value will be
used in many of the output pixels. However, in pooling each input
pixel may be only used in a single output pixel (if the stride and
the pool size are the same).
• Special cases of Warping an image are similar to pooling. For
example calling ‘pool-sum’ with pool size of 2 will give the same
pixel values (except the outer edges) as giving the same image to
‘astwarp’ with ‘--scale=1/2 --centeroncorner’. However, warping
will only provide the sum of the input pixels, there is no easy way
to generically define something like ‘pool-max’ in Warp (which is
far more general than pooling). Also, due to its generic features
(for example for non-linear warps), Warp is slower than the
‘pool-max’ that is introduced here.
*No WCS in output:* As of Gnuastro 0.22, the output of pooling will not
contain WCS information (primarily due to a lack of time by developers).
Please inform us of your interest in having it, by contacting us at
‘bug-gnuastro@gnu.org’. If you need ‘pool-sum’, you can use *note
Warp:: (which also modifies the WCS, see note above).
If the width or height of input is not divisible by the stride size,
the pool window will go beyond the input pixel grid. In this case, the
window pixels that do not overlap with the input are given a blank value
(and thus ignored in the calculation of the desired statistical
operation).
The simple ASCII figure below shows the pooling operation where the
input is a $3\times3$ pixel image with a pool size of 2 pixels. In the
center of the second row, you see the intermediate input matrix that
highlights how the input and output pixels relate with each other.
Since the input is $3\times3$ and we have a stride size of 2, as
mentioned above blank pseudo-pixels are added with a value of ‘B’ (for
blank).
Pool window: Input:
+-----------+ +-------------+
| | | | 10 12 9 |
| _ _ | _ _ |___________________________| 31 4 1 |
| | | || || | 16 5 8 |
| | | || || +-------------+
+-----------+ || ||
The pooling window 2*2 || ||
stride 2 \/ \/
+---------------------+
|/ 10 12\|/ 9 B \|
| | |
+-------+ pool-min |\ 31 4 /|\ 1 B /| pool-max +-------+
| 4 1 | /------ |---------------------| ------\ |31 9 |
| 5 8 | \------ |/ 16 5 \|/ 8 B \| ------/ |16 8 |
+-------+ | | | +-------+
|\ B B /.\ B B /|
+---------------------+
The choice of the statistic to use depends on the specific use case,
the characteristics of the input data, and the desired output. Each
statistic has its advantages and disadvantages and the choice of which
to use should be informed by the specific needs of the problem at hand.
Below, the various pool operators of arithmetic are listed:
‘pool-max’
Apply max-pooling on the input dataset. This operator takes three
operands: the first popped operand is the stride and the second is
the width of the square pooling window (which should be a single
integer). Also, The third operand should be the input image.
Within the pooling window, this operator will place the largest
value in the output pixel (any blank pixels will be ignored).
See the ASCII diagram above for a demonstration of how max-pooling
works. Here is an example of using this operator:
$ astarithmetic image.fits 2 2 pool-max
Max-pooling retains the largest value of the input window in the
output, so the returned image is sharper where you have strong
signal-to-noise ratio and more noisy in regions with no significant
signal (only noise). It is therefore useful when the background of
the image is dark and we are interested in only the highest
signal-to-noise ratio regions of the image.
‘pool-min’
Apply min-pooling on the input dataset. This operator takes three
operands: the first popped operand is the stride and the second is
the width of the square pooling window (which should be a single
integer). Also, The third operand should be the input image.
Except the used statistical measurement, this operator is similar
to ‘pool-max’, see the description there for more.
Min-pooling is mostly used when the image has a high
signal-to-noise ratio and a light background: min-pooling will
select darker (lower-valued) pixels. For low signal-to-noise
regions, this operator will increase the noise level (similar to
the maximum, the scatter in the minimum is very strong).
‘pool-sum’
Apply sum-pooling to the input dataset. This operator takes three
operands: the first popped operand is the stride and the second is
the width of the square pooling window (which should be a single
integer). Also, The third operand should be the input image.
Except the used statistical measurement, this operator is similar
to ‘pool-max’, see the description there for more.
Sum-pooling will increase the signal-to-noise ratio at the cost of
having a smoother output (less resolution).
‘pool-mean’
Apply mean pooling on the input dataset. This operator takes three
operands: the first popped operand is the stride and the second is
the width of the square pooling window (which should be a single
integer). Also, The third operand should be the input image.
Except the used statistical measurement, this operator is similar
to ‘pool-max’, see the description there for more.
The mean pooling method smooths out the image and hence the sharp
features may not be identified when this pooling method is used.
This therefore preserves more information than max-pooling, but may
also reduces the effect of the most prominent pixels. Mean is
often used where a more accurate representation of the input is
required.
‘pool-median’
Apply median pooling on the input dataset. This operator takes
three operands: the first popped operand is the stride and the
second is the width of the square pooling window (which should be a
single integer). Also, The third operand should be the input
image. Except the used statistical measurement, this operator is
similar to ‘pool-max’, see the description there for more.
In general, the mean is mathematically easier to interpret and more
susceptible to outliers, while the median outputs as being less
subject to the influence of outliers compared to the mean so we
have a smoother image. This is therefore better for low
signal-to-ratio (noisy) features and extended features (where you
don't want a single high or low valued pixel to affect the output).
File: gnuastro.info, Node: Interpolation operators, Next: Dimensionality changing operators, Prev: Pooling operators, Up: Arithmetic operators
6.2.4.10 Interpolation operators
................................
Interpolation is the process of removing blank pixels from a dataset (by
giving them a value based on the non-blank neighbors).
‘interpolate-medianngb’
Interpolate the blank elements of the second popped operand with
the median of nearest non-blank neighbors to each. The number of
the nearest non-blank neighbors used to calculate the median is
given by the first popped operand.
The distance of the nearest non-blank neighbors is irrelevant in
this interpolation. The neighbors of each blank pixel will be
parsed in expanding circular rings (for 2D images) or spherical
surfaces (for 3D cube) and each non-blank element over them is
stored in memory. When the requested number of non-blank neighbors
have been found, their median is used to replace that blank
element. For example, the line below replaces each blank element
with the median of the nearest 5 pixels.
$ astarithmetic image.fits 5 interpolate-medianngb
When you want to interpolate blank regions and you want each blank
region to have a fixed value (for example, the centers of saturated
stars) this operator is not good. Because the pixels used to
interpolate various parts of the region differ. For such
scenarios, you may use ‘interpolate-maxofregion’ or
‘interpolate-inofregion’ (described below).
‘interpolate-meanngb’
Similar to ‘interpolate-medianngb’, but will fill the blank values
of the dataset with the mean value of the requested number of
nearest neighbors.
‘interpolate-minngb’
Similar to ‘interpolate-medianngb’, but will fill the blank values
of the dataset with the minimum value of the requested number of
nearest neighbors.
‘interpolate-maxngb’
Similar to ‘interpolate-medianngb’, but will fill the blank values
of the dataset with the maximum value of the requested number of
nearest neighbors. One useful implementation of this operator is
to fill the saturated pixels of stars in images.
‘interpolate-minofregion’
Interpolate all blank regions (consisting of many blank pixels that
are touching) in the second popped operand with the minimum value
of the pixels that are immediately bordering that region (a single
value). The first popped operand is the connectivity (see
description in ‘connected-components’).
For example, with the command below all the connected blank regions
of ‘image.fits’ will be filled. Its an image (2D dataset), so a 2
connectivity means that the independent blank regions are defined
by 8-connected neighbors. If connectivity was 1, the regions would
be defined by 4-connectivity: blank regions that may only be
touching on the corner of one pixel would be identified as separate
regions.
$ astarithmetic image.fits 2 interpolate-minofregion
‘interpolate-maxofregion’
Similar to ‘interpolate-minofregion’, but the maximum is used to
fill the blank regions.
This operator can be useful in filling saturated pixels in stars
for example. Recall that the ‘interpolate-maxngb’ operator looks
for the maximum value with a given number of neighboring pixels and
is more useful in small noisy regions. Therefore as the blank
regions become larger, ‘interpolate-maxngb’ can cause a
fragmentation in the connected blank region because the nearest
neighbor to one part of the blank region, may not fall within the
pixels searched for the other regions. With this option, the size
of the blank region is irrelevant: all the pixels bordering the
blank region are parsed and their maximum value is used for the
whole region.
File: gnuastro.info, Node: Dimensionality changing operators, Next: Conditional operators, Prev: Interpolation operators, Up: Arithmetic operators
6.2.4.11 Dimensionality changing operators
..........................................
Through these operators you can change the dimensions of the output
through certain statistics on the dimensions that should be removed.
For example, let's assume you have a 3D data cube that has 300 by 300
pixels in the RA and Dec dimensions (first two dimensions), and 3600
slices along the wavelength (third dimension), so the whole cube is
$300\times300\times3600$ voxels (volume elements). To create a
narrow-band image that only contains 100 slices around a certain
wavelength, you can crop that section (using *note Crop::), giving you a
$300\times300\times100$ cube. You can now use the ‘collapse-sum’
operator below to "collapse" all the 100 slices into one 2D image that
has $300\times300$ pixels. Every pixel in this 2D image will have the
flux of the sum of the 100 slices.
‘to-1d’
Convert the input operand into a 1D array; irrespective of the
number of dimensions it has. This operator only takes a single
operand (the input array) and just updates the metadata. Therefore
it does not change the layout of the array contents in memory and
is very fast.
If no further operation is requested on the 1D array, recall that
Arithmetic will write a 1D array as a table column by default. In
case you want the output to be saved as a 1D image, or to see it on
the standard output, please use the ‘--onedasimage’ or
‘--onedonstdout’ options respectively (see *note Invoking
astarithmetic::).
This operator is useful in scenarios where after some operations on
a 2D image or 3D cube, the dimensionality is no longer relevant for
you and you just care about the values. In the example below, we
will first make a simple 2D image from a plain-text file, then
convert it to a 1D array:
## Contents of 'a.txt' to start with.
$ cat a.txt
# Image 1: DEMO [counts, uint8] An example image
1 2 3
4 5 6
7 8 9
## Convert the text image into a FITS image.
$ astconvertt a.txt -o a.fits
## Convert it into a table column (1D):
$ astarithmetic a.fits to-1d -o table.fits
## Convert it into a 1D image:
$ astarithmetic a.fits to-1d -o table.fits --onedasimage
A more real-world example would be the following: assume you want
to "flatten" two images into a single 1D array (as commonly done in
convolutional neural networks, or CNNs(1)). First, we show the
contents of a new $2\times2$ image in plain-text image, then
convert it to a 2D FITS image (‘b.fits’). We will then use
arithmetic to make both ‘a.fits’ (from the example above) and
‘b.fits’ into a 1D array and stitch them together into a single 1D
image with one call to Arithmetic. For a description of the
‘stitch’ operator, see below (same section).
## Contents of 'b.txt':
$ cat b.txt
# Image 1: DEMO [counts, uint8] An example image
10 11
12 13
## Convert the text image into a FITS image.
$ astconvertt b.txt -o b.fits
# Flatten the two images into a single 1D image:
$ astarithmetic a.fits to-1d b.fits to-1d 2 1 stitch -g1 \
--onedonstdout --quiet
1
2
3
4
5
6
7
8
9
10
11
12
13
‘stitch’
Stitch (connect) any number of given images together along the
given dimension. The output has the same number of dimensions as
the input, but the number of pixels along the requested dimension
will be different from the inputs. The ‘stitch’ operator takes at
least three operands:
• The first popped operand (placed just before ‘stitch’) is the
direction (dimension) that the images should be stitched
along. The first FITS dimension is along the horizontal,
therefore a value of ‘1’ will stitch them horizontally.
Similarly, giving a value of ‘2’ will result in a vertical
stitch.
• The second popped operand is the number of images that should
be stitched.
• Depending on the value given to the second popped operand,
‘stitch’ will pop the given number of datasets from the stack
and stitch them along the given dimension. The popped images
have to have the same number of pixels along the other
dimension. The order of the stitching is defined by how they
are placed in the command-line, not how they are popped (after
being popped, they are placed in a list in the same order).
For example, in the commands below, we will first crop out fixed
sized regions of $100\times300$ pixels of a larger image
(‘large.fits’) first. In the first call of Arithmetic below, we
will stitch the bottom set of crops together along the first
(horizontal) axis. In the second Arithmetic call, we will stitch
all 6 along both dimensions.
## Crop the fixed-size regions of a larger image ('-O' is the
## short form of the '--mode' option).
$ astcrop large.fits -Oimg --section=1:100,1:300 -oa.fits
$ astcrop large.fits -Oimg --section=101:200,1:300 -ob.fits
$ astcrop large.fits -Oimg --section=201:300,1:300 -oc.fits
$ astcrop large.fits -Oimg --section=1:100,301:600 -od.fits
$ astcrop large.fits -Oimg --section=101:200,301:600 -oe.fits
$ astcrop large.fits -Oimg --section=201:300,301:600 -of.fits
## Stitch the bottom three crops into one image.
$ astarithmetic a.fits b.fits c.fits 3 1 stitch -obottom.fits
# Stitch all the 6 crops along both dimensions
$ astarithmetic a.fits b.fits c.fits 3 1 stitch \
d.fits e.fits f.fits 3 1 stitch \
2 2 stitch -g1 -oall.fits
The start of the last command is like the one before it (stitching
the bottom three crops along the first FITS dimension, producing a
$300\times300$ image). Later in the same command, we then stitch
the top three crops horizontally (again, into a $300\times300$
image) This leaves the the two $300\times300$ images on the stack
(see *note Reverse polish notation::). We finally stitch those two
along the second (vertical) dimension. This operator is therefore
useful in scenarios like placing the CCD amplifiers into one image.
‘trim’
Trim all blank elements from the outer edges of the input operand
(it only takes a single operand). For example see the commands
below using Table's *note Column arithmetic:::
$ cat table.txt
nan
nan
nan
3
4
nan
5
6
nan
$ asttable table.txt -Y -c'arith $1 trim'
3.000000
4.000000
nan
5.000000
6.000000
Similarly, on 2D images or 3D cubes, all outer rows/columns or
slices that are fully blank get "trim"ed with this operator. This
is therefore a very useful operator for extracting a certain
feature within your dataset.
For example, let's assume that you have set *note NoiseChisel:: and
*note Segment:: on an image to extract all clumps and objects.
With the command below on Segment's output, you will have a smaller
image that only contains the sky-subtracted input pixels
corresponding to object 263.
$ astarithmetic seg.fits -hINPUT-NO-SKY seg.fits -hOBJECTS \
263 ne nan where trim --output=obj-263.fits
‘add-dimension-slow’
Build a higher-dimensional dataset from all the input datasets
stacked after one another (along the slowest dimension). The first
popped operand has to be a single number. It is used by the
operator to know how many operands it should pop from the stack
(and the size of the output in the new dimension). The rest of the
operands must have the same size and numerical data type. This
operator currently only works for 2D input operands, please contact
us if you want inputs to have different dimensions.
The output's WCS (which should have a different dimensionality
compared to the inputs) can be read from another file with the
‘--wcsfile’ option. If no file is specified for the WCS, the first
dataset's WCS will be used, you can later add/change the necessary
WCS keywords with the FITS keyword modification features of the
Fits program (see *note Fits::).
If your datasets do not have the same type, you can use the type
transformation operators of Arithmetic that are discussed below.
Just beware of overflow if you are transforming to a smaller type,
see *note Numeric data types::.
For example, let's assume you have 3 two-dimensional images
‘a.fits’, ‘b.fits’ and ‘c.fits’ (each with $200\times100$ pixels).
You can construct a 3D data cube with $200\times100\times3$ voxels
(volume-pixels) using the command below:
$ astarithmetic a.fits b.fits c.fits 3 add-dimension-slow
‘add-dimension-fast’
Similar to ‘add-dimension-slow’ but along the fastest dimension.
This operator currently only works for 1D input operands, please
contact us if you want inputs to have different dimensions.
For example, let's assume you have 3 one-dimensional datasets, each
with 100 elements. With this operator, you can construct a
$3\times100$ pixel FITS image that has 3 pixels along the
horizontal and 5 pixels along the vertical.
‘collapse-sum’
Collapse the given dataset (second popped operand), by summing all
elements along the first popped operand (a dimension in FITS
standard: counting from one, from fastest dimension). The returned
dataset has one dimension less compared to the input.
The output will have a double-precision floating point type
irrespective of the input dataset's type. Doing the operation in
double-precision (64-bit) floating point will help the collapse
(summation) be affected less by floating point errors. But
afterwards, single-precision floating points are usually enough in
real (noisy) datasets. So depending on the type of the input and
its nature, it is recommended to use one of the type conversion
operators on the returned dataset.
If any WCS is present, the returned dataset will also lack the
respective dimension in its WCS matrix. Therefore, when the WCS is
important for later processing, be sure that the input is aligned
with the respective axes: all non-diagonal elements in the WCS
matrix are zero.
One common application of this operator is the creation of pseudo
broad-band or narrow-band 2D images from 3D data cubes. For
example, integral field unit (IFU) data products that have two
spatial dimensions (first two FITS dimensions) and one spectral
dimension (third FITS dimension). The command below will collapse
the whole third dimension into a 2D array the size of the first two
dimensions, and then convert the output to single-precision
floating point (as discussed above).
$ astarithmetic cube.fits 3 collapse-sum float32
‘collapse-min’
‘collapse-max’
‘collapse-mean’
‘collapse-median’
Similar to ‘collapse-sum’, but the returned dataset will be the
desired statistic along the collapsed dimension, not the sum.
‘collapse-madclip-fill-mad’
‘collapse-madclip-fill-std’
‘collapse-madclip-fill-mean’
‘collapse-madclip-fill-median’
‘collapse-madclip-fill-number’
Collapse the input dataset (fourth popped operand) along the FITS
dimension given as the first popped operand by calculating the
desired statistic after median absolute deviation (MAD) filled
re-clipping. The MAD-clipping parameters (namely, the multiple of
sigma and termination criteria) are read as the third and second
popped operands respectively.
This is the most robust method to reject outliers; for more on
filled re-clipping and its advantages, see *note Filled
re-clipping::. For a more general tutorial on rejecting outliers,
see *note Clipping outliers::. If you have not done this tutorial
yet, we recommend you to take an hour or so and go through that
tutorial for optimal understanding and results.
For example, with the command below, the pixels of the input 2
dimensional ‘image.fits’ will be collapsed to a single dimension
output. The first popped operand is ‘2’, so it will collapse all
the pixels that are vertically on top of each other. Such that the
output will have the same number of pixels as the horizontal axis
of the input. During the collapsing, all pixels that are more than
$3\sigma$ (third popped operand) are rejected, and the clipping
will continue until the standard deviation changes less than $0.2$
between clips. Finally the ‘counter’ operator is used to have a
two-column table with the first one being a simple counter starting
from one (see *note Size and position operators::).
$ astarithmetic image.fits 3 0.2 2 collapse-sigclip-mean \
counter --output=collapsed-vertical.fits
*Printing output of collapse in plain-text:* the default datatype
of ‘collapse-sigclip-mean’ is 32-bit floating point. This is
sufficient for any observed astronomical data. However, if you
request a plain-text output, or decide to print/view the output as
plain-text on the standard output, the full set of decimals may not
be printed in some situations. This can lead to apparently
discrete values in the output of this operator when viewed in
plain-text! The FITS format is always superior (since it stores
the value in binary, therefore not having the problem above). But
if you are forced to save the output in plain-text, use the
‘float64’ operator after this to change the type to 64-bit floating
point (which will print more decimals).
‘collapse-madclip-mad’
‘collapse-madclip-std’
‘collapse-madclip-mean’
‘collapse-madclip-median’
‘collapse-madclip-number’
Collapse the input dataset (fourth popped operand) along the FITS
dimension given as the first popped operand by calculating the
desired statistic after median absolute deviation (MAD) clipping.
This operator is called similarly to the ‘collapse-madclip-fill-*’
operators, see the description there for more.
‘collapse-sigclip-fill-mad’
‘collapse-sigclip-fill-std’
‘collapse-sigclip-fill-mean’
‘collapse-sigclip-fill-median’
‘collapse-sigclip-fill-number’
Collapse the input dataset (fourth popped operand) along the FITS
dimension given as the first popped operand by calculating the
desired statistic after filled $\sigma$ re-clipping. This operator
is called similarly to the ‘collapse-madclip-fill-*’ operators, see
the description there for more.
‘collapse-sigclip-mad’
‘collapse-sigclip-std’
‘collapse-sigclip-mean’
‘collapse-sigclip-median’
‘collapse-sigclip-number’
Collapse the input dataset (fourth popped operand) along the FITS
dimension given as the first popped operand by calculating the
desired statistic after $\sigma$-clipping. This operator is called
similarly to the ‘collapse-madclip-fill-*’ operators, see the
description there for more.
---------- Footnotes ----------
(1) <https://en.wikipedia.org/wiki/Convolutional_neural_network>
File: gnuastro.info, Node: Conditional operators, Next: Mathematical morphology operators, Prev: Dimensionality changing operators, Up: Arithmetic operators
6.2.4.12 Conditional operators
..............................
Conditional operators take two inputs and return a binary output that
can only have two values 0 (for pixels where the condition was false) or
1 (for the pixels where the condition was true). Because of the binary
(2-valued) nature of their outputs, the output is therefore stored in an
‘unsigned char’ data type (see *note Numeric data types::) to speed up
process and take less space in your storage. There are two exceptions
to the general features above: ‘isblank’ only takes one input, and
‘where’ takes three, while not returning a binary output, see their
description for more.
‘lt’
Less than: creates a binary output (values either 0 or 1) where
each pixel will be 1 if the second popped operand is smaller than
the first popped operand and 0 otherwise. If both operands are
images, then all the pixels will be compared with their
counterparts in the other image.
For example, the pixels in the output of the command below will
have a value of 1 (true) if their value in ‘image1.fits’ is less
than their value in ‘image2.fits’. Otherwise, their value will be
0 (false).
$ astarithmetic image1.fits image2.fits lt
If only one operand is an image, then all the pixels will be
compared with the single value (number) of the other operand. For
example:
$ astarithmetic image1.fits 1000 lt
Finally if both are numbers, then the output is also just one
number (0 or 1).
$ astarithmetic 4 5 lt
‘le’
Less or equal: similar to ‘lt’ ('less than' operator), but
returning 1 when the second popped operand is smaller or equal to
the first. For example
$ astarithmetic image1.fits 1000 le
‘gt’
Greater than: similar to ‘lt’ ('less than' operator), but returning
1 when the second popped operand is greater than the first. For
example
$ astarithmetic image1.fits 1000 gt
‘ge’
Greater or equal: similar to ‘lt’ ('less than' operator), but
returning 1 when the second popped operand is larger or equal to
the first. For example
$ astarithmetic image1.fits 1000 ge
‘eq’
Equality: similar to ‘lt’ ('less than' operator), but returning 1
when the two popped operands are equal (to double precision
floating point accuracy).
$ astarithmetic image1.fits 1000 eq
‘ne’
Non-Equality: similar to ‘lt’ ('less than' operator), but returning
1 when the two popped operands are _not_ equal (to double precision
floating point accuracy).
$ astarithmetic image1.fits 1000 ne
‘and’
Logical AND: returns 1 if both operands have a non-zero value and 0
if both are zero. Both operands have to be the same kind: either
both images or both numbers and it mostly makes meaningful values
when the inputs are binary (with pixel values of 0 or 1).
$ astarithmetic image1.fits image2.fits -g1 and
For example, if you only want to see which pixels in an image have
a value _between_ 50 (greater equal, or inclusive) and 200 (less
than, or exclusive), you can use this command:
$ astarithmetic image.fits set-i i 50 ge i 200 lt and
‘or’
Logical OR: returns 1 if either one of the operands is non-zero and
0 only when both operators are zero. Both operands have to be the
same kind: either both images or both numbers. The usage is
similar to ‘and’.
For example, if you only want to see which pixels in an image have
a value _outside of_ -100 (greater equal, or inclusive) and 200
(less than, or exclusive), you can use this command:
$ astarithmetic image.fits set-i i -100 lt i 200 ge or
‘not’
Logical NOT: returns 1 when the operand is 0 and 0 when the operand
is non-zero. The operand can be an image or number, for an image,
it is applied to each pixel separately. For example, if you want
to know which pixels are not blank (and assuming that we didn't
have the ‘isnotblank’ operator), you can use this ‘not’ operator on
the output of the ‘isblank’ operator described below:
$ astarithmetic image.fits isblank not
‘isblank’
Test each pixel for being a blank value (see *note Blank pixels::).
This is a conditional operator: the output has the same size and
dimensions as the input, but has an unsigned 8-bit integer type
with two possible values: either 1 (for a pixel that was blank) or
0 (for a pixel that was not blank). See the description of ‘lt’
operator above). The difference is that it only needs one operand.
For example:
$ astarithmetic image.fits isblank
Because of the definition of a blank pixel, a blank value is not
even equal to itself, so you cannot use the equal operator above to
select blank pixels. See the "Blank pixels" box below for more on
Blank pixels in Arithmetic.
In case you want to set non-blank pixels to an output pixel value
of 1, it is better to use ‘isnotblank’ instead of '‘isblank not’'
(for more, see the description of ‘isnotblank’).
‘isnotblank’
The inverse of the ‘isblank’ operator above (see that description
for more). Therefore, if a pixel has a blank value, the output of
this operator will have a 0 value for it. This operator is
therefore similar to running '‘isblank not’', but slightly more
efficient (won't need the intermediate product of two operators).
‘where’
Change the input (pixel) value _where_/if a certain condition
holds. The conditional operators above can be used to define the
condition. Three operands are required for ‘where’. The input
format is demonstrated in this simplified example:
$ astarithmetic modify.fits binary.fits if-true.fits where
The value of any pixel in ‘modify.fits’ that corresponds to a
non-zero _and_ non-blank pixel of ‘binary.fits’ will be changed to
the value of the same pixel in ‘if-true.fits’ (this may also be a
number). The 3rd and 2nd popped operands (‘modify.fits’ and
‘binary.fits’ respectively, see *note Reverse polish notation::)
have to have the same dimensions/size. ‘if-true.fits’ can be
either a number, or have the same dimension/size as the other two.
The 2nd popped operand (‘binary.fits’) has to have ‘uint8’ (or
‘unsigned char’ in standard C) type (see *note Numeric data
types::). It is treated as a binary dataset (with only two values:
zero and non-zero, hence the name ‘binary.fits’ in this example).
However, commonly you will not be dealing with an actual FITS file
of a condition/binary image. You will probably define the
condition in the same run based on some other reference image and
use the conditional and logical operators above to make a
true/false (or one/zero) image for you internally. For example,
the case below:
$ astarithmetic in.fits reference.fits 100 gt new.fits where
In the example above, any of the ‘in.fits’ pixels that has a value
in ‘reference.fits’ greater than ‘100’, will be replaced with the
corresponding pixel in ‘new.fits’. Effectively the ‘reference.fits
100 gt’ part created the condition/binary image which was added to
the stack (in memory) and later used by ‘where’. The command above
is thus equivalent to these two commands:
$ astarithmetic reference.fits 100 gt --output=binary.fits
$ astarithmetic in.fits binary.fits new.fits where
Finally, the input operands are read and used independently, so you
can use the same file more than once as any of the operands.
When the 1st popped operand to ‘where’ (‘if-true.fits’) is a single
number, it may be a NaN value (or any blank value, depending on its
type) like the example below (see *note Blank pixels::). When the
number is blank, it will be converted to the blank value of the
type of the 3rd popped operand (‘in.fits’). Hence, in the example
below, all the pixels in ‘reference.fits’ that have a value greater
than 100, will become blank in the natural data type of ‘in.fits’
(even though NaN values are only defined for floating point types).
$ astarithmetic in.fits reference.fits 100 gt nan where
File: gnuastro.info, Node: Mathematical morphology operators, Next: Bitwise operators, Prev: Conditional operators, Up: Arithmetic operators
6.2.4.13 Mathematical morphology operators
..........................................
From Wikipedia: "Mathematical morphology (MM) is a theory and technique
for the analysis and processing of geometrical structures, based on set
theory, lattice theory, topology, and random functions. MM is most
commonly applied to digital images". In theory it extends a very large
body of research and methods in image processing, but currently in
Gnuastro it mainly applies to images that are binary (only have a value
of 0 or 1). For example, you have applied the greater-than operator
(‘gt’, see *note Conditional operators::) to select all pixels in your
image that are larger than a value of 100. But they will all have a
value of 1, and you want to separate the various groups of pixels that
are connected (for example, peaks of stars in your image). With the
‘connected-components’ operator, you can give each connected region of
the output of ‘gt’ a separate integer label.
‘erode’
Erode the foreground pixels (with value ‘1’) of the input dataset
(second popped operand). The first popped operand is the
connectivity (see description in ‘connected-components’). Erosion
is simply a flipping of all foreground pixels (with value ‘1’) to
background (with value ‘0’) that are "touching" background pixels.
"Touching" is defined by the connectivity.
In effect, this operator "carves off" the outer borders of the
foreground, making them thinner. This operator assumes a binary
dataset (all pixels are ‘0’ or ‘1’). For example, imagine that you
have an astronomical image with a mean/sky value of 0 units and a
standard deviation ($\sigma$) of 100 units and many galaxies in it.
With the first command below, you can apply a threshold of
$2\sigma$ on the image (by only keeping pixels that are greater
than 200 using the ‘gt’ operator). The output of thresholding the
image is a binary image (each pixel is either smaller or equal to
the threshold or larger than it). You can then erode the binary
image with the second command below to remove very small false
positives (one or two pixel peaks).
$ astarithmetic image.fits 100 gt -obinary.fits
$ astarithmetic binary.fits 2 erode -oout.fits
In fact, you can merge these operations into one command thanks to
the reverse polish notation (see *note Reverse polish notation::):
$ astarithmetic image.fits 100 gt 2 erode -oout.fits
To see the effect of connectivity, try this:
$ astarithmetic image.fits 100 gt 1 erode -oout-con-1.fits
‘dilate’
Dilate the foreground pixels (with value ‘1’) of the binary input
dataset (second popped operand). The first popped operand is the
connectivity (see description in ‘connected-components’). Dilation
is simply a flipping of all background pixels (with value ‘0’) to
foreground (with value ‘1’) that are "touching" foreground pixels.
"Touching" is defined by the connectivity. In effect, this expands
the outer borders of the foreground. This operator assumes a
binary dataset (all pixels are ‘0’ and ‘1’). The usage is similar
to ‘erode’, for example:
$ astarithmetic binary.fits 2 dilate -oout.fits
‘number-neighbors’
Return a dataset of the same size as the second popped operand, but
where each non-zero and non-blank input pixel is replaced with the
number of its non-zero and non-blank neighbors. The first popped
operand is the connectivity (see above) and must be a single-value
of an integer type. The dataset is assumed to be binary (having an
unsigned, 8-bit dataset).
For example with the command below, you can select all pixels above
a value of 100 in your image with the "greater-than" or ‘gt’
operator (see *note Conditional operators::). Recall that the
output of all conditional operators is a binary output (having a
value of 0 or 1). In the same command, we will then find how many
neighboring pixels of each pixel (that was originally above the
threshold) are also above the threshold.
$ astarithmetic image.fits 100 gt 2 number-neighbors
‘connected-components’
Find the connected components in the input dataset (second popped
operand). The first popped is the connectivity used in the
connected components algorithm. The second popped operand is the
dataset where connected components are to be found. It is assumed
to be a binary image (with values of 0 or 1). It must have an
8-bit unsigned integer type which is the format produced by
conditional operators. This operator will return a labeled dataset
where the non-zero pixels in the input will be labeled with a
counter (starting from 1).
The connectivity is a number between 1 and the number of dimensions
in the dataset (inclusive). 1 corresponds to the weakest
(symmetric) connectivity between elements and the number of
dimensions the strongest. For example, on a 2D image, a
connectivity of 1 corresponds to 4-connected neighbors and 2
corresponds to 8-connected neighbors.
One example usage of this operator can be the identification of
regions above a certain threshold, as in the command below. With
this command, Arithmetic will first separate all pixels greater
than 100 into a binary image (where pixels with a value of 1 are
above that value). Afterwards, it will label all those that are
connected.
$ astarithmetic in.fits 100 gt 2 connected-components
If your input dataset does not have a binary type, but you know all
its values are 0 or 1, you can use the ‘uint8’ operator (below) to
convert it to binary.
‘fill-holes’
Flip background (0) pixels surrounded by foreground (1) in a binary
dataset. This operator takes two operands (similar to
‘connected-components’): the second is the binary (0 or 1 valued)
dataset to fill holes in and the first popped operand is the
connectivity (to define a hole). Imagine that in your dataset
there are some holes with zero value inside the objects with one
value (for example, the output of the thresholding example of
‘erode’) and you want to fill the holes:
$ astarithmetic binary.fits 2 fill-holes
‘invert’
Invert an unsigned integer dataset (will not work on other data
types, see *note Numeric data types::). This is the only operator
that ignores blank values (which are set to be the maximum values
in the unsigned integer types).
This is useful in cases where the target(s) has(have) been imaged
in absorption as raw formats (which are unsigned integer types).
With this option, the maximum value for the given type will be
subtracted from each pixel value, thus "inverting" the image, so
the target(s) can be treated as emission. This can be useful when
the higher-level analysis methods/tools only work on emission
(positive skew in the noise, not negative).
$ astarithmetic image.fits invert
File: gnuastro.info, Node: Bitwise operators, Next: Numerical type conversion operators, Prev: Mathematical morphology operators, Up: Arithmetic operators
6.2.4.14 Bitwise operators
..........................
Astronomical images are usually stored as an array multi-byte pixels
with different sizes for different precision levels (see *note Numeric
data types::). For example, images from CCDs are usually in the
unsigned 16-bit integer type (each pixel takes 16 bits, or 2 bytes, of
memory) and fully reduced deep images have a 32-bit floating point type
(each pixel takes 32 bits or 4 bytes).
On the other hand, during the data reduction, we need to preserve a
lot of meta-data about some pixels. For example, if a cosmic ray had
hit the pixel during the exposure, or if the pixel was saturated, or is
known to have a problem, or if the optical vignetting is too strong on
it. A crude solution is to make a new image when checking for each one
of these things and make a binary image where we flag (set to 1) pixels
that satisfy any of these conditions above, and set the rest to zero.
However, processing pipelines sometimes need more than 20 flags to store
important per-pixel meta-data, and recall that the smallest numeric data
type is one byte (or 8 bits, that can store up to 256 different values),
while we only need two values for each flag! This is a major waste of
storage space!
A much more optimal solution is to use the bits within each pixel to
store different flags! In other words, if you have an 8-bit pixel, use
each bit as a flag to mark if a certain condition has happened on a
certain pixel or not. For example, let's set the following standard
based on the four cases mentioned above: the first bit will show that a
cosmic ray has hit that pixel. So if a pixel is only affected by cosmic
rays, it will have this sequence of bits (note that the bit-counting
starts from the right): ‘00000001’. The second bit shows that the pixel
was saturated (‘00000010’), the third bit shows that it has known
problems (‘00000100’) and the fourth bit shows that it was affected by
vignetting (‘00001000’).
Since each bit is independent, we can thus mark multiple metadata
about that pixel in the actual image, within a single "flag" or "mask"
pixel of a flag or mask image that has the same number of pixels. For
example, a flag-pixel with the following bits ‘00001001’ shows that it
has been affected by cosmic rays _and_ it has been affected by
vignetting at the same time. The common data type to store these
flagging pixels are unsigned integer types (see *note Numeric data
types::). Therefore when you open an unsigned 8-bit flag image in a
viewer like DS9, you will see a single integer in each pixel that
actually has 8 layers of metadata in it! For example, the integer you
will see for the bit sequences given above will respectively be: $2^0=1$
(for a pixel that only has cosmic ray), $2^1=2$ (for a pixel that was
only saturated), $2^2=4$ (for a pixel that only has known problems),
$2^3=8$ (for a pixel that is only affected by vignetting) and $2^0 + 2^3
= 9$ (for a pixel that has a cosmic ray _and_ was affected by
vignetting).
You can later use this bit information to mark objects in your final
analysis or to mask certain pixels. For example, you may want to set
all pixels affected by vignetting to NaN, but can interpolate over
cosmic rays. You therefore need ways to separate the pixels with a
desired flag(s) from the rest. It is possible to treat a flag pixel as
a single integer (and try to define certain ranges in value to select
certain flags). But a much more easier and robust way is to actually
look at each pixel as a sequence of bits (not as a single integer!) and
use the bitwise operators below for this job. For more on the theory
behind bitwise operators, see Wikipedia
(https://en.wikipedia.org/wiki/Bitwise_operation).
‘bitand’
Bitwise AND operator: only bits with values of 1 in both popped
operands will get the value of 1, the rest will be set to 0. For
example, (assuming numbers can be written as bit strings on the
command-line): ‘00101000 00100010 bitand’ will give ‘00100000’.
Note that the bitwise operators only work on integer type datasets.
‘bitor’
Bitwise inclusive OR operator: The bits where at least one of the
two popped operands has a 1 value get a value of 1, the others 0.
For example, (assuming numbers can be written as bit strings on the
command-line): ‘00101000 00100010 bitand’ will give ‘00101010’.
Note that the bitwise operators only work on integer type datasets.
‘bitxor’
Bitwise exclusive OR operator: A bit will be 1 if it differs
between the two popped operands. For example, (assuming numbers
can be written as bit strings on the command-line): ‘00101000
00100010 bitand’ will give ‘00001010’. Note that the bitwise
operators only work on integer type datasets.
‘lshift’
Bitwise left shift operator: shift all the bits of the first
operand to the left by a number of times given by the second
operand. For example, (assuming numbers can be written as bit
strings on the command-line): ‘00101000 2 lshift’ will give
‘10100000’. This is equivalent to multiplication by 4. Note that
the bitwise operators only work on integer type datasets.
‘rshift’
Bitwise right shift operator: shift all the bits of the first
operand to the right by a number of times given by the second
operand. For example, (assuming numbers can be written as bit
strings on the command-line): ‘00101000 2 rshift’ will give
‘00001010’. Note that the bitwise operators only work on integer
type datasets.
‘bitnot’
Bitwise not (more formally known as one's complement) operator:
flip all the bits of the popped operand (note that this is the only
unary, or single operand, bitwise operator). In other words, any
bit with a value of ‘0’ is changed to ‘1’ and vice-versa. For
example, (assuming numbers can be written as bit strings on the
command-line): ‘00101000 bitnot’ will give ‘11010111’. Note that
the bitwise operators only work on integer type datasets/numbers.
File: gnuastro.info, Node: Numerical type conversion operators, Next: Random number generators, Prev: Bitwise operators, Up: Arithmetic operators
6.2.4.15 Numerical type conversion operators
............................................
With the operators below you can convert the numerical data type of your
input, see *note Numeric data types::. Type conversion is particularly
useful when dealing with integers, see *note Integer benefits and
pitfalls::.
As an example, let's assume that your colleague gives you many single
exposure images for processing, but they have a double-precision
floating point type! You know that the statistical error a
single-exposure image can never exceed 6 or 7 significant digits, so you
would prefer to archive them as a single-precision floating point and
save space on your computer (a double-precision floating point is also
double the file size!). You can do this with the ‘float32’ operator
described below.
‘u8’
‘uint8’
Convert the type of the popped operand to 8-bit unsigned integer
type (see *note Numeric data types::). The internal conversion of
C will be used.
‘i8’
‘int8’
Convert the type of the popped operand to 8-bit signed integer type
(see *note Numeric data types::). The internal conversion of C
will be used.
‘u16’
‘uint16’
Convert the type of the popped operand to 16-bit unsigned integer
type (see *note Numeric data types::). The internal conversion of
C will be used.
‘i16’
‘int16’
Convert the type of the popped operand to 16-bit signed integer
(see *note Numeric data types::). The internal conversion of C
will be used.
‘u32’
‘uint32’
Convert the type of the popped operand to 32-bit unsigned integer
type (see *note Numeric data types::). The internal conversion of
C will be used.
‘i32’
‘int32’
Convert the type of the popped operand to 32-bit signed integer
type (see *note Numeric data types::). The internal conversion of
C will be used.
‘u64’
‘uint64’
Convert the type of the popped operand to 64-bit unsigned integer
(see *note Numeric data types::). The internal conversion of C
will be used.
‘f32’
‘float32’
Convert the type of the popped operand to 32-bit (single precision)
floating point (see *note Numeric data types::). The internal
conversion of C will be used. For example, if ‘f64.fits’ is a
64-bit floating point image, and you want to store it as a 32-bit
floating point image, you can use the command below (the second
command is to show that the output file consumes half the storage)
$ astarithmetic f64.fits float32 --output=f32.fits
$ ls -lh f64.fits f32.fits
‘f64’
‘float64’
Convert the type of the popped operand to 64-bit (double precision)
floating point (see *note Numeric data types::). The internal
conversion of C will be used.
File: gnuastro.info, Node: Random number generators, Next: Coordinate and border operators, Prev: Numerical type conversion operators, Up: Arithmetic operators
6.2.4.16 Random number generators
.................................
When you simulate data (for example, see *note Sufi simulates a
detection::), everything is ideal and there is no noise! The final step
of the process is to add simulated noise to the data. The operators in
this section are designed for that purpose. To learn more about the
definition and implementation "noise", see *note Noise basics::.
In case each data element's random distribution should have an
independent parameter (for example $\sigma$ in a Gaussian distribution),
the first popped operand can be a dataset of the same size as the
second. In this case (when the parameter is not a single value, but an
array), each element will have a different parameter.
When ‘--quiet’ is not given, a statement will be printed on each
invocation of these operators (if there are multiple calls to the
‘mknoise-*’ operators, the statement will be printed multiple times).
It will show the random number generator function and seed that was used
in that invocation. These are necessary for the future reproducibility
of the outputs using the ‘--envseed’ option, for more, see *note
Generating random numbers::. For example, with the first command below,
‘image.fits’ will be degraded by a noise of standard deviation 3 units.
$ astarithmetic image.fits 3 mknoise-sigma
Alternatively, you can use the operators in this section within the
*note Column arithmetic:: feature of the Table program. For example,
with the command below, you can generate a random number (centered on 0,
with $\sigma=3$). With the second command, you can put it into a shell
variable for later usage.
$ echo 0 | asttable -c'arith $1 3 mknoise-sigma'
$ value=$(echo 0 | asttable -c'arith $1 3 mknoise-sigma' --quiet)
$ echo $value
You can also use the operators here in combination with AWK to easily
generate an arbitrarily large table with random columns. In the example
below, we will create a two column table with 20 rows. The first column
will be centered on 5 and $\sigma_1=2$, the second will be centered on
10 and $\sigma_2=3$:
$ echo 5 10 \
| awk '{for(i=0;i<20;++i) print $1, $2}' \
| asttable -c'arith $1 2 mknoise-sigma' \
-c'arith $2 3 mknoise-sigma'
By adding an extra ‘--output=random.fits’, the table will be saved
into a file called ‘random.fits’, and you can change the ‘i<20’ to
‘i<5000’ to have 5000 rows instead. Of course, if your input table has
different values in the desired column the noisy distribution will be
centered on each input element, but all will have the same
scatter/sigma.
As mentioned above, you can use the ‘--envseed’ option to pre-define
the random number generator seed (and thus get a reproducible result).
For more on ‘--envseed’, see *note Generating random numbers::. When
using column arithmetic in Table, it may happen that multiple columns
need random numbers (with any of the ‘mknoise-*’ operators) in one call
of ‘asttable’. In such cases, the value given to ‘GSL_RNG_SEED’ is
incremented by one on every call to the ‘mknoise-*’ operators. Without
this increment, when the column values are the same (happens a lot, for
no-noised datasets), the returned values for all columns will be
identical. But this feature has a side-effect: that if the order of
calling the ‘mknoise-*’ operators changes, the seeds used for each
operator will change(1).
‘mknoise-sigma’
Add a Gaussian noise with pre-defined $\sigma$ to each element of
the input dataset (independent of the input pixel value). $\sigma$
is the standard deviation of the Gaussian or Normal distribution
(https://en.wikipedia.org/wiki/Normal_distribution). This operator
takes two arguments: the top/first popped operand is the noise
standard deviation, the next popped operand is the dataset that the
noise should be added to.
For example, with the first command below, let's put a Sérsic
profile with Sérsic index 1 and effective radius 10 pixels,
truncated at 5 times the effective radius in the center of a mock
image that is $100\times100$ pixels wide. We will also give it a
position angle of 45 degrees and an axis ratio of 0.8, and set it
to have a total electron count of 10000 (‘1e4’ in the command).
Note that this example is focused on this operator, for a robust
simulation, see the tutorial in *note Sufi simulates a detection::.
With the second command, let's add noise to this image and with the
third command, we'll subtract the raw image from the noised image.
Finally, we'll view them both together:
$ echo "1 50 50 1 10 1 45 0.8 1e4 5" \
| astmkprof --mergedsize=100,100 --oversample=1 \
--mcolissum --output=raw.fits
$ astarithmetic raw.fits 2 mknoise-sigma --output=sigma.fits
$ astarithmetic raw.fits sigma.fits - -g1 \
--output=diff-sigma.fits
$ astscript-fits-view raw.fits sigma.fits diff-sigma.fits
You see that the final ‘diff-sigma.fits’ distribution was
independent of the pixel values of the input. You will also notice
that within ‘sigma.fits’ the noisy pixels that had a zero value in
‘raw.fits’, the noise fluctuates around zero (is negative in half
of those pixels). These behaviors will be different in the case
for ‘mknoise-sigma-from-mean’ below, which is more "realistic" (or
Poisson-like).
‘mknoise-sigma-from-mean’
Replace each input element (e.g., pixel in an image) of the input
with a random value taken from a Gaussian distribution (for pixel
$i$) with mean $\mu_i$ and standard deviation $\sigma_i$. Where,
$\sigma_i=\sqrt{I_i+B_i}$ and $\mu_i=I_i+B_i$ and $I_i$ and $B_i$
are respectively the values of the input image, and background in
that same pixel. In other words, this can be seen as approximating
a Poisson distribution at high mean values (where the Poisson
distribution becomes identical to the Gaussian distribution).
This operator takes two arguments: 1. the first popped operand
(just before the operator) is the _per-pixel_ background value (in
units of electron counts). 2. The second popped operand is the
dataset that the noise should be added to.
To demonstrate the effect of this noise pattern, please run the
example commands in the description of ‘mknoise-sigma’. With the
first command below, let's add this Poisson-like noise (assuming a
background level of 4 electron counts, to be similar to a
$\sigma=2$ of the example in ‘mknoise-sigma’). With the second
command, let's subtract the raw image from this noise pattern:
$ astarithmetic raw.fits 4 mknoise-sigma-from-mean \
--output=sigma-from-mean.fits
$ astarithmetic raw.fits sigma-from-mean.fits - -g1 \
--output=diff-sigma-from-mean.fits
$ astscript-fits-view diff-sigma.fits \
diff-sigma-from-mean.fits
You clearly see how the noise in the center of the Sérsic profile
is much stronger than the outer parts. As described, above, this
is behavior we would expect in a "real" observation: the regions
with stronger signal, also have stronger noise as defined through
the Poisson distribution
(https://en.wikipedia.org/wiki/Poisson_distribution)! The reason
we described this operator as "Poisson-like" is that, it has some
shortcomings as opposed to the ‘mknoise-poisson’ operator (that is
described below):
• For low mean values (less than 3 for example), this will
produce a symmetric Gaussian distribution, while the Poisson
distribution will not be symmetric.
• The random values from this distribution are floating point
(unlike the Poisson distribution that produces integers.
• The random values can be negative (which is not possible in a
Poisson distribution).
Therefore to simulate photon-starved images (for example UV or
X-ray data), the ‘mknoise-poisson’ operator should always be used,
not this one. However, in optical (or redder bands) data, the
background is very bright (much brighter than 10 counts for
example). In such cases (as the mean increases), the Poisson
distributions becomes identical to the Gaussian distribution.
Furthermore, processed co-add/stacked images are no longer
integers, but floating points with the Sky-level already subtracted
(see *note Sky value::). Therefore if you are trying to simulate a
processed, photon-rich dataset, you can safely use this operator.
Recall that the background values reported by observatories (for
example, to define dark or gray nights), or in papers, is usually
reported in units of magnitudes per arcseconds square. You need to
do the conversion to counts per pixel manually. The conversion of
magnitudes to counts is described below. For converting arcseconds
squared to number of pixels, you can use the ‘--pixelscale’ option
of *note Fits::. For example, ‘astfits image.fits --pixelscale’.
Except for the noise-model, this operator is very similar to
‘mknoise-sigma’ and the examples there apply here too. The main
difference with ‘mknoise-sigma’ is that in a Poisson distribution
the scatter/sigma will depend on each element's value.
For example, let's assume you have made a mock image called
‘mock.fits’ with *note MakeProfiles:: and it is assumed zero point
is 22.5 (for more on the zero point, see *note Brightness flux
magnitude::). Let's assume the background level for the Poisson
noise has a value of 19 magnitudes. You can first use the
‘mag-to-counts’ operator to convert this background magnitude into
counts, then feed the background value in counts to
‘mknoise-sigma-from-mean’ operator:
$ astarithmetic mock.fits 19 22.5 mag-to-counts \
mknoise-sigma-from-mean
Try changing the background value from 19 to 10 to see the effect!
Recall that the tutorial *note Sufi simulates a detection:: shows
how you can use MakeProfiles to build mock images.
‘mknoise-poisson’
Replace every pixel of the input with a random integer taken from a
Poisson distribution with the mean value of that input pixel.
Similar to ‘mknoise-sigma-from-mean’, it takes two operands: 1.
The first popped operand (just before the operator) is the
per-pixel background value (in units of electron counts). 2. The
second popped operand is the dataset that the noise should be added
to.
To demonstrate this noise pattern, let's use ‘mknoise-poisson’ in
the example of the description of ‘mknoise-sigma-from-mean’ with
the first command below. The second command below will show you
the two images side-by-side, you will notice that the Poisson
distribution's un-detected regions are slightly darker (this is
because of the skewness of the Poisson distribution). Finally,
with the last two commands, you can see the histograms of the two
distributions:
$ astarithmetic raw.fits 4 mknoise-poisson \
--output=poisson.fits
$ astscript-fits-view sigma-from-mean.fits poisson.fits
$ aststatistics sigma-from-mean.fits --lessthan=10
-------
Histogram:
| ***
| ******
| **********
| ***********
| **************
| ****************
| ******************
| **********************
| **************************
| ********************************** *
|* **********************************************************
|------------------------------------------------------------
$ aststatistics poisson.fits --lessthan=10
-------
Histogram:
| * *
| * *
| * * *
| * * * *
| * * * * *
| * * * * *
| * * * * * * *
| * * * * * * *
| * * * * * * * *
| * * * * * * * *
| * * * * * * * *
|------------------------------------------------------------
The extra skewness in the Poisson distribution, and the fact that
it only returns integers is therefore clear with the commands
above. The comparison was further made above in the description of
‘mknoise-sigma-from-mean’. In summary, you should prefer the
Poisson distribution when you are simulating the following
scenarios:
• A photon-starved image (as in UV or X-ray).
• A raw exposure of a photon-rich image (which may be
photon-rich, but always integers).
‘mknoise-uniform’
Add uniform noise to each element of the input dataset. This
operator takes two arguments: the top/first popped operand is the
width of the interval, the second popped operand is the dataset
that the noise should be added to (each element will be the center
of the interval). The returned random values may happen to be the
minimum interval value, but will never be the maximum. Except for
the noise-model, this operator behaves very similar to
‘mknoise-sigma’, see the explanation there for more.
For example, with the command below, a random value will be
selected between 10 to 14 (centered on 12, which is the only input
data element, with a total width of 4).
echo 12 | asttable -c'arith $1 4 mknoise-uniform'
Similar to the example in ‘mknoise-sigma’, you can pipe the output
of ‘echo’ to ‘awk’ before passing it to ‘asttable’ to generate a
full column of uniformly selected values within the same interval.
‘random-from-hist-raw’
Generate random values from a custom distribution (defined by a
histogram). The output will have a double-precision floating point
type (see *note Numeric data types::). This operator takes three
operands:
• The first popped operand (nearest to the operator) is the
histogram values. The histogram is a 1-dimensional dataset (a
table column) and contains the probability of obtaining a
certain interval of values. The histogram does not have to be
normalized: the GNU Scientific Library (or GSL, which is used
by Gnuastro for this operator), will normalize it internally.
The value of each bin (whose probability is given in the
histogram) is given in the second popped operand. Therefore
these two operands have to have the same number of rows.
• The second popped operand is the bin value (mostly the bin
center, but it can be anything). The probability of each bin
is defined in the histogram operand (first popped operand).
The bins can have any width (do not have to be evenly spaced),
and any order. Just make sure that the same row in the bins
column corresponds to the same row in the histogram: the
number of rows in the bins and histogram must be equal.
• The third popped operand is the dataset that the random values
should be written over. Effectively only its size will be
used by this operator (all values will be over-written as a
double-precision floating point number).
The first two operands have to be single-dimensional (a table
column) and have the same number of rows, but the last popped
operand can have any number of dimensions. You can use the
‘load-col-’ operator to load the two bins and histogram columns
from an external file (see *note Loading external columns::).
For example, in the command below, we first construct a fake
histogram to represent a $y=x^2$ distribution with AWK. We aim to
distribute random values from this distribution in a $100\times100$
image. Therefore, we use the ‘makenew’ operator to construct an
empty image of that size, use the ‘load-col-’ operator to load the
histogram columns into Arithmetic and put the output in
‘random.fits’. Finally we visually inspect ‘random.fits’ with DS9
and also have a look at its pixel distribution with
‘aststatistics’.
$ echo "" | awk '{for(i=1;i<5;++i) print i, i*i}' \
> histogram.txt
$ cat histogram.txt
1 1
2 4
3 9
4 16
$ astarithmetic 100 100 2 makenew \
load-col-1-from-histogram.txt \
load-col-2-from-histogram.txt \
random-from-hist-raw \
--output=random.fits
$ astscript-fits-view random.fits
$ aststatistics random.fits --asciihist --numasciibins=50
| *
| *
| *
| *
| * *
| * *
| * *
| * * *
| * * *
|* * * *
|* * * *
|--------------------------------------------------
As you see, the 10000 pixels in the image only have values 1, 2, 3
or 4 (which were the values in the bins column of ‘histogram.txt’),
and the number of times each of these values occurs follows the
$y=x^2$ distribution.
Generally, any value given in the bins column will be used for the
final output values. For example, in the command below (for
generating a histogram from an analytical function), we are adding
the bins by 20 (while keeping the same probability distribution of
$y=x^2$). If you re-run the Arithmetic command above after this,
you will notice that the pixels values are now one of the following
21, 22, 23 or 24 (instead of 1, 2, 3, or 4). But the shape of the
histogram of the resulting random distribution will be unchanged.
$ echo "" | awk '{for(i=1;i<5;++i) print 20+i, i*i}' \
> histogram.txt
If you do not want the outputs to have exactly the value of the bin
identifier, but be a randomly selected value from a uniform
distribution within the bin, you should use ‘random-from-hist’ (see
below).
As mentioned above, the output will have a double-precision
floating point type (see *note Numeric data types::). Therefore,
by default each element of the output will consume 8 bytes
(64-bits) of storage. This is usually far more than the
statistical error/precision of your data (and just results in
wasted storage in your file system, or wasted RAM when a program
that uses the data is being run, and a slower running time of the
program).
It is therefore recommended to use a type-conversion operator after
this operator to put the output in the smallest type that can be
used to safely store your data without wasting storage, RAM or
time. For the list of type conversion operators, see *note
Numerical type conversion operators::. Recall that you already
know the values returned by this operator (they are one of the
values in the bins column).
For example, in the example above, the whole image only has values
1, 2, 3 or 4. Since they are always positive and are below 255, we
can safely place them in an unsigned 8-bit integer (see *note
Numeric data types::) with the command below (note the ‘uint8’
after the operator name, and that we are using a different name for
the output). After building the new image, let's have a look at
the sizes of the two images with ‘ls -l’:
$ astarithmetic 100 100 2 makenew \
load-col-1-from-histogram.txt \
load-col-2-from-histogram.txt \
random-from-hist-raw uint8 \
--output=random-u8.fits
$ ls -lh random.fits random-u8.fits
-rw-r--r-- 1 name name 85K Jan 01 13:40 random.fits
-rw-r--r-- 1 name name 17K Jan 01 13:45 random-u8.fits
As you see, when using a suitable data type, we can shrink the size
of the file significantly without loosing any information (from 85
kilobytes to 17 kilobytes). This difference can be felt much
better for larger (real-world) datasets, so be sure to always set
the output data type after calling this operator.
‘random-from-hist’
Similar to ‘random-from-hist-raw’, but do not return the exact bin
value, instead return a random value from a uniform distribution
within each bin. Therefore the following limitations have to be
taken into account (compared to ‘random-from-hist-raw’):
• The number associated with each bin (in the bin column) should
be its center.
• The bins have to be in descending order (so the second row in
the bin column is larger than the first).
• The bin widths (distance from one bin to another) have to be
fixed.
For a demonstration, let's replace ‘random-from-hist-raw’ with
‘random-from-hist’ in the example of the description of
‘random-from-hist-raw’. Note how we are manually converting the
output of this operator into single-precision floating point
(32-bit, since the default 64-bit precision is statistically
meaningless in this scenario and we do not want to waste storage,
memory and running time):
$ echo "" | awk '{for(i=1;i<5;++i) print i, i*i}' \
> histogram.txt
$ astarithmetic 100 100 2 makenew \
load-col-1-from-histogram.txt \
load-col-2-from-histogram.txt \
random-from-hist float32 \
--output=random.fits
$ aststatistics random.fits --asciihist --numasciibins=50
| *
| *** ********
| ************
| *************
| * * *************
| * ***********************
| *************************
| *************************
| *************************************
|********* * **************************************
|**************************************************
|--------------------------------------------------
You can see that the pixels of ‘histogram.fits’ are no longer just
1, 2, 3 or 4. Instead, the values within each bin are selected
from a uniform distribution covering that bin. This creates the
step-like feature in the histogram of the output.
Of course, this extra uniform random number generation can make
your program slower so be sure to check if it is worth it. In
particular, one way to avoid this (and use ‘random-from-hist-raw’
with a more contiguous-looking output distribution) is to simply
use a higher-resolution histogram (assuming it is possible: you
have a sufficient number of data points, or you have an analytical
expression that you can sample at smaller bin sizes).
To better demonstrate this operator and its practical usage in
everyday research, let's look at another example: Assume you want
to get 100 random star magnitudes that follow the real-world Gaia
Data release 3 magnitude distribution within a radius of 2 degrees
around the (RA,Dec) coordinate of (1.23,4.56). Let's further
assume that you want to distribute them uniformly over an image of
size 1000 by 1000 pixels. So your desired output table should have
three columns, the first two are pixel positions of each star, and
the third is the magnitude.
First, we need to query the Gaia database and ask for all the
magnitudes in this region of the sky. We know that Gaia is not
complete for stars fainter than the 20th magnitude, so we will use
the ‘--range’ option and only ask for those stars that are brighter
than magnitude 20.
$ astquery gaia --dataset=dr3 --center=1.23,3.45 --radius=2 \
--column=phot_g_mean_mag --output=gaia.fits \
--range=phot_g_mean_mag,-inf,20
We now have more than 25000 magnitudes in ‘gaia.fits’! To get a
more accurate random sampling of our stars, let's construct a
histogram with 500 bins, and generate our three desired randomly
selected columns:
$ aststatistics gaia.fits --histogram --numbins=500 \
--output=gaia-hist.fits
$ asttable gaia-hist.fits -i
$ echo 1000 \
| awk '{for(i=0;i<100;++i) print $1/2}' \
| asttable -c'arith $1 500 mknoise-uniform' \
-c'arith $1 500 mknoise-uniform' \
-c'arith $1 \
load-col-1-from-gaia-hist.fits-hdu-1 \
load-col-2-from-gaia-hist.fits-hdu-1 \
random-from-hist float32'
These columns can easily be placed in the format for *note
MakeProfiles:: to be inserted into an image automatically.
---------- Footnotes ----------
(1) We have defined Task 15971 (https://savannah.gnu.org/task/?15971)
in Gnuastro's project management system to address this. If you need
this feature please send us an email at ‘bug-gnuastro@gnu.org’ (to
motivate us in its implementation).
File: gnuastro.info, Node: Coordinate and border operators, Next: Loading external columns, Prev: Random number generators, Up: Arithmetic operators
6.2.4.17 Coordinate and border operators
........................................
The operators here help you in defining or manipulating coordinates.
For examples to define the "box" (a rectangular region) that surrounds
an ellipse or to rotate a point around a reference point.
‘rotate-coord’
Rotate the given point (horizontal and vertical coordinates given
in 5th and 4th popped operands) around a center/reference point
(coordinates given in the 3rd and 2nd popped operands) by a given
angle (first popped operand).
For example, if you want to trace the outer edge of a circle
centered on (1.23,45.6) with a radius of 0.78, you can use this
operator like below. The logic is that we assume a single point
that is located on 0.78 units after the center on the horizontal
axis (the point's vertical axis position is the same as the
center). We then rotate this point in each row by one degree to
build the circle's circumference.
$ cx=1.23
$ cy=45.6
$ rad=0.78
$ seq 0 360 \
| awk '{print '$rad'+'$cx', '$cy', $1}' \
| asttable -c'arith $1 $2 '$cx' '$cy' $3 rotate-coord' \
--output=circle.fits
## Within TOPCAT, after opening "Plane Plot", within "Axes" select
## "Aspect lock" so the steps in both axis is the same.
$ astscript-fits-view circle.fits
If you want the points to create a circle on the celestial sphere,
you can use the ‘eq-j2000-from-flat’ operator after this one (see
*note Column arithmetic::):
$ seq 0 360 \
| awk '{print '$rad'+'$cx', '$cy', $1}' \
| asttable -c'arith $1 $2 '$cx' '$cy' $3 rotate-coord \
'$cx' '$cy' TAN eq-j2000-from-flat' \
--output=circle-on-sky.fits
When you open TOPCAT, if you open the "Plane Plot", you will see an
ellipse. However, if you open "Sky Plot" (from the "Graphics"
menu), and select the first and second columns respectively, you
will see a circle.
The center coordinates and angle can be fixed for all the rows (as
in the example above) or be different for every row. Recall that
if you want these to change on every row, you should give the
column name (or number followed by ‘$’) for these operands instead
of the constant number above.
‘box-around-ellipse’
Return the width (along horizontal) and height (along vertical) of
a box that encompasses an ellipse with the same center point. The
top-popped operand is assumed to be the position angle (angle from
the horizontal axis) in _degrees_. The second and third popped
operands are the minor and major radii of the ellipse respectively.
This operator outputs two operands on the general stack. The first
one is the width and the second (which will be the top one when
this operator finishes) is the height.
If the value to the second popped operand (minor axis) is larger
than the third (major axis), a NaN value will be written for both
the width and height of that element and a warning will be printed
(the warning can be disabled with the ‘--quiet’ option).
As an example, if your ellipse has a major axis radius of 10 units,
a minor axis radius of 4 units and a position angle of 20 degrees,
you can estimate the bounding box with this command:
$ echo "10 4 20" \
| asttable -c'arith $1 $2 $3 box-around-ellipse'
Alternatively if your three values are in separate FITS
arrays/images, you can use the command below to have the width and
height in similarly sized fits arrays. In this example ‘a.fits’
and ‘b.fits’ are respectively the major and minor axis lengths and
‘pa.fits’ is the position angle (in degrees). Also, in all three,
we assume the first extension is used. After it is done, the
height of the box will be put in ‘h.fits’ and the width will be in
‘w.fits’. Just note that because this operator has two output
datasets, you need to first write the height (top output operand)
into a file and free it with the ‘tofilefree-’ operator, then write
the width in the file given to ‘--output’.
$ astarithmetic a.fits b.fits pa.fits box-around-ellipse \
tofilefree-h.fits -ow.fits -g1
Finally, if you need to treat the width and height separately for
further processing, you can call the ‘set-’ operator two times
afterwards like below. Recall that the ‘set-’ operator will pop
the top operand, and put it in memory with a certain name, bringing
the next operand to the top of the stack.
For example, let's assume ‘catalog.fits’ has at least three columns
‘MAJOR’, ‘MINOR’ and ‘PA’ which specify the major axis, minor axis
and position angle respectively. But you want the final width and
height in 32-bit floating point numbers (not the default 64-bit,
which may be too much precision in many scenarios). You can do
this with the command below (note you can also break lines with
<\>, within the single-quote environment)
$ asttable catalog.fits \
-c'arith MAJOR MINOR PA box-around-ellipse \
set-height set-width \
width float32 height float32'
‘box-vertices-on-sphere’
Convert a box center and width to the coordinates of the vertices
of the box on a left-hand spherical coordinate system. In a
left-handed spherical coordinate system, the longitude increases
towards the left while north is up (as in the RA and Dec direction
of the equatorial coordinate system used in astronomy). This
operator therefore takes four input operands (the RA and Dec of the
box's center, as well as the width of the box in each direction).
After it is complete, this operator places 8 operands on the stack
which contain the RA and Dec of the four vertices of the box in the
following anti-clockwise order:
1. Bottom-left vertice Longitude (RA)
2. Bottom-left vertice Latitude (Dec)
3. Bottom-right vertice Longitude (RA)
4. Bottom-right vertice Latitude (Dec)
5. Top-right vertice Longitude (RA)
6. Top-right vertice Latitude (Dec)
7. Top-left vertice Longitude (RA)
8. Top-left vertice Latitude (Dec)
For example, with the command below, we will retrieve the vertice
coordinates of a rectangle around a point with RA=20 and Dec=0 (on
the equator). The rectangle will have a 1 degree edge along the RA
direction and a 2 degree edge along the declination. In this
example, we are using the ‘-Afixed -B2’ only for demonstration
purposes here due to the round numbers! In general, it is best to
write your outputs to a binary FITS table to preserve the full
precision (see *note Printing floating point numbers::).
$ echo "20 0 1 2" \
| asttable -Afixed -B2 \
-c'arith $1 $2 $3 $4 box-vertices-on-sphere'
20.50 -1.00 19.50 -1.00 19.50 1.00 20.50 1.00
We see that the bottom-left vertice is at (RA,Dec) of
$(20.50,-1.0)$ and the top-right vertice is at $(19.50,1.00)$.
These could have easily been done by manually adding and
subtracting! But you will see that the complexity arises at
higher/lower declinations. For example, with the command below,
let's see how vertice coordinates of the same box, but after moving
its center to (RA,Dec) of (20,85):
$ echo "20 85 1 2" \
| asttable -Afixed -B2 \
-c'arith $1 $2 $3 $4 box-vertices-on-sphere'
24.78 84.00 15.22 84.00 12.83 86.00 27.17 86.00
Even though, we didn't change the central RA (20) or the size of
the box along the RA (1 degree), the RA of the bottom-left vertice
is now at 24.78; almost 5 degrees away! This occurs because of the
spherical coordinate system, we measure the longitude (e.g., RA)
with the following way:
1. Draw a meridian that passes your point. The meridian is half
of a great-circle (https://en.wikipedia.org/wiki/Great_circle)
(which has a diameter that is equal to the sphere's diameter)
passes both poles.
2. Find the intersection of that meridian with the equator.
3. The distance of the intersection and the reference point
(along the equator) defines the longitude angle.
As you get more distant from the equator (declination becomes
non-zero), any change along the RA (towards the east; 1 degree in
the example above) will on longer be on a great circle, but along a
"small circle (https://en.wikipedia.org/wiki/Circle_of_a_sphere)".
On a small circle that is defined by the fixed declination
$\delta$, the distance of two points is closer than the distances
of their projection on the equator (as described in the definition
of longitude above). It is smaller by a factor of
$\cos({\delta})$.
Therefore, an angular change (let's call it $\Delta_{lon}$) along
the small circle defined by the fixed declination of $\delta$
corresponds to $\Delta_{lon}/\cos(\delta)$ on the equator.
File: gnuastro.info, Node: Loading external columns, Next: Size and position operators, Prev: Coordinate and border operators, Up: Arithmetic operators
6.2.4.18 Loading external columns
.................................
In the Arithmetic program, you can always load new dataset by simply
giving their name. However, they can only be images, not a column. In
the Table program, you can load columns in *note Column arithmetic::,
but it has to be columns within the same table (and thus the same number
of rows). However, in some situations, it is necessary to use certain
columns of a table in the Arithmetic program, or columns of different
rows (from the main input) in Table.
‘load-col-%-from-%’
‘load-col-%-from-%-hdu-%’
Load the requested column (first ‘%’) from the requested file
(second ‘%’). If the file is a FITS file, it is also necessary to
specify a HDU using the second form (where the HDU identifier is
the third ‘%’. For example, ‘load-col-MAG-from-catalog.fits-hdu-1’
will load the ‘MAG’ column from HDU 1 of ‘catalog.fits’.
For example, let's assume you have the following two tables, and
you would like to add the first column of the first with the
second:
$ asttable tab-1.fits
1 43.23
2 21.91
3 71.28
4 18.10
$ cat tab-2.txt
5
6
7
8
$ asttable tab-1.txt -c'arith $1 load-col-1-from-tab-2.txt +'
6
8
10
12
File: gnuastro.info, Node: Size and position operators, Next: Building new dataset and stack management, Prev: Loading external columns, Up: Arithmetic operators
6.2.4.19 Size and position operators
....................................
With the operators below you can get metadata about the top dataset on
the stack.
‘index’
Add a new operand to the stack with an integer type and the same
size (in all dimensions) as top operand on the stack (before it was
called; it is not popped!). The first pixel in the returned
operand is zero, and every later pixel's value is incremented by
one. It is important to remember that the top operand is not
popped by this operand, so it remains on the stack. After this
operand is finished, it adds a new operand to the stack. To pop
the previous operand, you can use the ‘indexonly’ operator.
The data type of the output is always an unsigned integer, and its
width is determined from the number of pixels/rows in the top
operand. For example if there are only 108 rows in a table, the
returned column will have an unsigned 8-bit integer type (that can
keep 256 separate values). But if the top operand is a
$1000\times1000=10^6$ pixel image, the output will be a 32-bit
unsigned integer. For the various types of integers, see *note
Numeric data types::.
To see the index image along with the actual image, you can use the
‘--writeall’ operator to have a multi-HDU output (without
‘--writeall’, Arithmetic will complain if more than one operand is
left at the end). After DS9 opens with the second command, flip
between the two extensions.
$ astarithmetic image.fits index --writeall
$ astscript-fits-view image_arith.fits
Below is a review some usage examples of this operator:
Image: masking margins
With the command below, we will be masking all pixels that are
20 pixels away from the edges of the image (on the margin).
Here is a description of the command below (for the basics of
Arithmetic's notation, see *note Reverse polish notation::):
• The ‘index’ operator just adds a new dataset on the
stack: unlike almost all other operators in Arithmetic,
‘index’ doesn't remove its input dataset from the stack
(use ‘indexonly’ for the "normal" behavior). This is
because ‘index’ returns the pixel metadata not data. As
a result, after ‘index’, we have two operands on the
stack: the input image and the index image.
• With the ‘set-i’ operator, the top operand (the image
containing the index of each pixel) is popped from the
stack and associated to the name ‘i’. Therefore after
this, the stack only has the input image. For more on
the ‘set-’ operator, see *note Operand storage in memory
or a file::.
• We need three values from the commands before Arithmetic
(for the width and height of the image and the size of
the margin). To make the rest of the command easier to
read/use, we'll define them in Arithmetic as three named
operators (respectively called ‘w’, ‘h’ and ‘m’). All
three are integers that will have a positive value lower
than $2^{16}=65536$ (for a "normal" image!). Therefore,
we will store them as 16-bit unsigned integers with the
‘uint16’ operator (this will help optimal processing in
later steps). For more the type changing operators, see
*note Numerical type conversion operators::.
• Using the modulo ‘%’ and division (‘/’) operators on the
index image and the width, we extract the horizontal (X)
and vertical (Y) positions of each pixel in separately
named operands called ‘X’ and ‘Y’. The maximum value in
these two will also fit within an unsigned 16-bit
integer, so we'll also store these in that type.
• For the horizontal (X) dimension, we select pixels that
are less than the margin (‘X m lt’) and those that are
more than the width subtracted by the margin (‘X w m -
gt’).
• The output of the ‘lt’ and ‘gt’ conditional operators
above is a binary (0 or 1 valued) image. We therefore
merge them into one binary image using the ‘or’ operator.
For more, see *note Conditional operators::.
• We repeat the two steps above for the vertical (Y)
dimension.
• Once the images containing the to-be-masked pixels in
each dimension are made, we combine them into one binary
image with a final ‘or’ operator. At this point, the
stack only has two operands: 1) the input image and 2)
the binary image that has a value of 1 for all pixels
whose value should be changed.
• A single-element operand (‘nan’) is added on the stack.
• Using the ‘where’ operator, we replace all the pixels
that are non-zero in the second operand (on the margins)
to the top operand's value (NaN) in the third popped
operand (image that was read from ‘image.fits’). For
more on the ‘where’ operator, see *note Conditional
operators::.
$ margin=20
$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
$ height=$(astfits image.fits --keyvalue=NAXIS2 -q)
$ astarithmetic image.fits index set-i \
$width uint16 set-w \
$height uint16 set-h \
$margin uint16 set-m \
i w % uint16 set-X \
i w / uint16 set-Y \
X m lt X w m - gt or \
Y m lt Y h m - gt or \
or nan where
Image: Masking regions outside a circle
As another example for usage on an image, in the command below
we are using ‘index’ to define an image where each pixel
contains the distance to the pixel with X,Y coordinates of
345,250. We are then using that distance image to only keep
the pixels that are within a 50 pixel radius of that point.
The basic concept behind this process is very similar to the
previous example, with a different mathematical definition for
pixels to mask. The major difference is that we want the
distance to a pixel within the image, we need to have negative
values and the center coordinates can be in a sub-pixel
positions. The best numeric datatype for intermediate steps
is therefore floating point. 64-bit floating point can have a
precision of up to 15 digits after the decimal point. This is
far too much for what we need here: in astronomical imaging,
the PSF is usually on the scale of 1 or more pixels (see *note
Sampling theorem::). So even reaching a precision of one
millionth of a pixel (offered by 32-bit floating points) is
beyond our wildest dreams (see *note Numeric data types::).
We will also define the horizontal (X) and vertical (Y)
operands after shifting to the desired central point.
$ radius=50
$ centerx=345.2
$ centery=250.3
$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
$ astarithmetic image.fits index set-i \
$width uint16 set-w \
$radius float32 set-r \
$centerx float32 set-cx \
$centery float32 set-cy \
i w % cx - set-X \
i w / cy - set-Y \
X X x Y Y x + sqrt r gt \
nan where --output=arith-masked.fits
*Optimal data types have significant benefits:* choosing the
minimum required datatype for your operation is very important
to avoid wasting your CPU and RAM. Don't simply default to
64-bit floating points for everything! Integer operations are
much faster than floating points, and within floating point
types, 32-bit is faster and will use half the RAM/storage!
For more, see *note Numeric data types::.
The example above was just a demo for usage of the ‘index’
operator and some important concepts. But it is not the
easiest way to achieve the desired result above! An easier
way for the scenario above (to keep a circle within an image
and set everything else to NaN) is to use MakeProfiles in
combination with Arithmetic, like below:
$ radius=50
$ centerx=345.2
$ centery=250.3
$ echo "1 $centerx $centery 5 $radius 0 0 1 1 1" \
| astmkprof --background=image.fits \
--mforflatpix --clearcanvas \
-omkprof-mask.fits --type=uint8
$ astarithmetic image.fits mkprof-mask.fits not \
nan where -g1 -omkprof-masked.fits
Tables: adding new columns with row index
Within Table, you can use this operator to add an index column
like below (see the ‘counter’ operator for starting the count
from one).
## The index will be the second column.
$ asttable table.fits -c'arith $1 index'
## The index will be the first column
$ asttable table.fits -c'arith $1 index swap'
‘indexonly’
Similar to ‘index’, except that the top operand is popped from the
stack and is no longer available afterwards.
‘counter’
Similar to ‘index’, except that counting starts from one (not zero
as in ‘index’). Counting from one is usually necessary when adding
row counters in tables, like below:
$ asttable table.fits -c'arith $1 counter swap'
‘counteronly’
Similar to ‘counter’, but the top operand before it is popped (no
longer available).
‘size’
Size of the dataset along a given FITS (or FORTRAN) dimension
(counting from 1). The desired dimension should be the first
popped operand and the dataset must be the second popped operand.
The output will be a single unsigned integer (dimensions cannot be
negative). For example, the following command will produce the
size of the first extension/HDU (the default HDU) of ‘a.fits’ along
the second FITS axis.
$ astarithmetic a.fits 2 size
*Not optimal:* This operator reads the top element on the stack and
then simply reads its size along the given dimension. On a small
dataset this won't consume much RAM, but if you want to put this in
a pipeline or use it on large image, the extra RAM and slow
operation can become meaningful. To avoid such issues, you can
read the size along the given dimension using the ‘--keyvalue’
option of *note Keyword inspection and manipulation::. For
example, in the code below, the X axis position of every pixel is
returned:
$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
$ astarithmetic image.fits indexonly $width % -opix-x.fits
File: gnuastro.info, Node: Building new dataset and stack management, Next: Operand storage in memory or a file, Prev: Size and position operators, Up: Arithmetic operators
6.2.4.20 Building new dataset and stack management
..................................................
With the operator here, you can create a new dataset from scratch to
start certain operations without any input data.
‘makenew’
Create a new dataset that only has zero values. The number of
dimensions is read as the first popped operand and the number of
elements along each dimension are the next popped operand (in
reverse of the popping order). The type of the new dataset is an
unsigned 8-bit integer and all pixel values have a value of zero.
For example, if you want to create a new 100 by 200 pixel image,
you can run this command:
$ astarithmetic 100 200 2 makenew
To further extend the example, you can use any of the noise-making
operators to add noise to this new dataset (see *note Random number
generators::), like the command below:
$ astarithmetic 100 200 2 makenew 5 mknoise-sigma
‘constant’
Return an operand that will have a constant value (first popped
operand) in all its elements. The number of elements is read from
the second popped operand. The second popped operand is only used
for its number of elements, its numeric data type, or its values
are fully ignored and it is later freed.
Here is one useful scenario for this operator in tables: you want
to merge the objects/rows of some catalogs together, but you first
want to give each source catalog a label/counter that distinguishes
between the source of each rows in the merged/final catalog (using
*note Invoking asttable::). The steps below show the the usage of
this.
## Add label 1 to the RA, Dec, magnitude and magnitude error
## rows of the first catalog.
$ asttable cat-1.fits -cRA,DEC,MAG,MAG_ERR \
-c'arith $1 1 constant' --output=tab-1.fits
## Similar to above, but for the second catalog.
$ asttable cat-2.fits -cRA,DEC,MAG,MAG_ERR \
-c'arith $1 2 constant' --output=tab-2.fits
## Concatenate (merge/blend) the rows of the two tables into
## one for the 5 columns, but also add a counter for each
## object or row in the final catalog.
$ asttable tab-1.fits --catrowfile=tab-2.fits \
-c'arith $1 counteronly' \
-cRA,DEC,MAG,MAG_ERR,5 --output=merged.fits \
--colmetadata=1,ID_MERGED,counter,"Merged ID." \
--colmetadata=6,SOURCE-CAT,counter,"Source ID."
## Add keyword information on each input. It is very important
## to preserve this within the merged catalog. If the tables
## came from public databases (for example on VizieR), give
## their public identifier as the value.
$ astfits merged.fits --write=/,"Source catalogs" \
--write=CATSRC1,"I/355/gaiadr3","VizieR ID." \
--write=CATSRC2,"Jane Doe","Name of source."
## Check the metadata in 'merged.fits' and clean the
## temporary files.
$ rm tab-1.fits tab-2.fits
$ astfits merged.fits -h1
Like most operators, ‘constant’ is not limited to tables, you can
also apply it on images. In the example below, we'll use
‘constant’ to set all the pixels of the input image to NaN (which
is necessary in scenarios that you need to include in an image in
an analysis, but you don't want its pixels to affect the
processing):
$ astarithmetic image.fits nan constant
‘swap’
Swap the top two operands on the stack. For example the ‘index’
operator doesn't pop with the top operand (the input to ‘index’),
it just adds the index image to the stack. In case you want your
next operation to be on the input to ‘index’, you can simply call
‘swap’ and continue the operations on that image, while keeping the
indexed pixels for later steps. In the example below we are using
the ‘--writeall’ option to write the full stack and if you open the
outputs you will see that the stack order has changed.
## Index image is written in HDU 1.
$ astarithmetic image.fits index --writeall \
--output=ind-first.fits
## image.fits in HDU 1.
$ astarithmetic image.fits index swap --writeall \
--output=img-first.fits