DataBase Wars: Using images.
Updated: June 18, 2019

In early June 2019 I began using various software tools that enable me to quickly view parameter spaces associated with sets of images. The point_selector script is a prime example.



Three different collections of images.

I have three sets of images.


fitsfind /home/sco/acm_SBSKY > list.acm                      # 174 images 
fitsfind /media/sco/DataDisk1/sco/AD/EFIGI > list.efigi      # 4459 images 
fitsfind /media/sco/DataDisk1/sco/AD/PFC > list.pfc          # 2022 images   
*** All of these ran on sco2019 almost instantly!!!! 

The SBSKY images are acm images that I have heavily processed (they have WCS and ZP in the header). The PFC images are CCD-processed, but the headers have no formal WCS. The EFIGI images have short headers with WCS but not much else. The galaxy PGC names are encoded in the image names. I have two general concerns with big image sets: can I establish the sky pointing (RA,DEC), and/or can I recognize the objects by name? For the first case, I made the wcs_image_centers script (more on this below). Here are the usage and the approximate processing times.

Usage: wcs_image_centers list.1 N  
arg1 - List of FITS images (that should have WCS) 
arg2 - run in verbose/debug mode (Y/N)

Typical run: 
% wcs_image_centers list.acm N 
The output is one RAsex,DECsex per line in a file named:   wcs_image_centers.out
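
To use these centers numerically they have to be turned into decimal degrees. Here is a minimal sketch in Python, assuming each line holds colon-separated sexagesimal values (RA in hours, DEC in degrees) separated by a space or a comma, and that images without WCS show up as "none none". The function names are mine, for illustration only; they are not part of wcs_image_centers.

```python
def sex_to_deg(text, hours=False):
    """Convert 'dd:mm:ss.s' (or 'hh:mm:ss.s' when hours=True) to decimal degrees."""
    text = text.strip()
    sign = -1.0 if text.startswith("-") else 1.0
    parts = [float(p) for p in text.lstrip("+-").split(":")]
    while len(parts) < 3:            # tolerate 'dd:mm' or plain 'dd'
        parts.append(0.0)
    value = parts[0] + parts[1] / 60.0 + parts[2] / 3600.0
    return sign * value * (15.0 if hours else 1.0)


def parse_center_line(line):
    """Return (ra_deg, dec_deg), or None when the line has no usable position."""
    fields = line.replace(",", " ").split()
    if len(fields) < 2 or "none" in (f.lower() for f in fields[:2]):
        return None                  # assumed marker for images without WCS
    try:
        return sex_to_deg(fields[0], hours=True), sex_to_deg(fields[1])
    except ValueError:
        return None                  # anything else that is not a number


if __name__ == "__main__":
    for sample in ("12:00:00.0 -30:30:00.0", "none none"):
        print(sample, "->", parse_center_line(sample))
```

Taking the sign from the text rather than from the first number keeps declinations like -00:30:00 negative.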

           Number of header cards    Number of images    Process Time           Comments
SBSKY              116                    174             7.0 seconds      Images are on my SSD drive, others on HDD 
PFC                 59                   2022            52.0 seconds      All RA,DEC =  none none  
EFIGI               43                   4459           374.0 seconds 

The processing times are for runs on my hot new machine (sco2019), so it is clear that these runs would take much longer on a clunker like mcs. The reason for this is that I use a lot of scripts (my own and wcstools) that operate on single images.
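
The per-image cost is easier to compare than the totals. A quick sketch of the arithmetic, using only the image counts and run times from the table above:

```python
# (number of images, run time in seconds) taken from the table above
runs = {
    "SBSKY": (174, 7.0),
    "PFC": (2022, 52.0),
    "EFIGI": (4459, 374.0),
}


def per_image_ms(n_images, seconds):
    """Average wall-clock time per image, in milliseconds."""
    return 1000.0 * seconds / n_images


for name, (n, t) in runs.items():
    print(f"{name:6s} {per_image_ms(n, t):6.1f} ms/image  {n / t:5.1f} images/s")
```

That works out to roughly 40, 26 and 84 ms per image for SBSKY, PFC and EFIGI.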

Hence, before I even get to the open source vs. personal database code question, I am faced with the script vs. dedicated Fortran code issue. Here are some things to consider:

  1. The acm headers are huge (especially raw headers). I need a way of handling insanely large headers.
  2. Important information is in the image name. I may want to treat this along with the header.
Considering these points, and viewing the shitty run times above, I decided to experiment with another straight-up Fortran code.
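
As an illustration of point 2, a catalog name can be pulled out of the file name and stored next to the header values. This is only a sketch in Python: it assumes the name appears as "PGC" followed by digits somewhere in the file name, and the example paths in the comments are made up.

```python
import os
import re

# Assumed naming convention: 'PGC', an optional underscore, then digits.
PGC_PATTERN = re.compile(r"PGC_?(\d+)", re.IGNORECASE)


def pgc_from_path(path):
    """Return a normalized 'PGC<number>' name found in the file name, or None."""
    match = PGC_PATTERN.search(os.path.basename(path))
    return f"PGC{int(match.group(1))}" if match else None


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for path in ("/data/EFIGI/PGC0000212_r.fits", "/data/acm/frame_0001.fits"):
        print(path, "->", pgc_from_path(path))
```

Only the base name is searched, so a directory that happens to contain "PGC" does not produce a false match.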

My list of images is in:    list.ALL    (all 3 of the list.* files above, N=6654 images) 

Usage: survey_headers.sh LIST.IN N 
arg1 - input file listing FITS headers to survey
arg2 - Run in a debug/verbose mode 

What the code does:
  1) makes a quick file per image of key header cards
  2) locates the DATE card in the FCARDS common block 
  3) pulls the DATE card value 
  4) inspects the DATE card value and creates a conforming ISO 8601 date string
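
The steps above can be sketched in Python. This is not the Fortran code behind survey_headers.sh, just a stand-in showing the same idea: read the primary header directly as 80-character cards in 2880-byte blocks, find the DATE card, and normalize its value. It assumes the DATE value is either in the old FITS form 'dd/mm/yy' (meaning 19yy) or already in the newer 'yyyy-mm-dd[Thh:mm:ss]' form; all function names here are my own.

```python
import re
from datetime import datetime

CARD = 80      # a FITS header card is 80 ASCII characters
BLOCK = 2880   # headers are padded to multiples of 2880 bytes (36 cards)


def read_primary_header(path):
    """Return {keyword: raw value text} for the primary header of a FITS file."""
    cards = {}
    with open(path, "rb") as fh:
        while True:
            block = fh.read(BLOCK)
            if len(block) < BLOCK:
                return cards                      # truncated file: keep what we have
            for i in range(0, BLOCK, CARD):
                card = block[i:i + CARD].decode("ascii", errors="replace")
                key = card[:8].strip()
                if key == "END":
                    return cards
                if card[8:10] == "= ":            # value cards only
                    cards.setdefault(key, card[10:])


def card_string_value(raw):
    """Strip quotes and any trailing comment (does not handle doubled quotes)."""
    raw = raw.strip()
    if raw.startswith("'"):
        end = raw.find("'", 1)
        return raw[1:end].strip() if end > 0 else raw[1:].strip()
    return raw.split("/", 1)[0].strip()


def to_iso8601(value):
    """Normalize a DATE value to 'YYYY-MM-DDThh:mm:ss', or None if unrecognized."""
    value = value.strip()
    old = re.fullmatch(r"(\d{2})/(\d{2})/(\d{2})", value)
    try:
        if old:                                   # old convention: dd/mm/yy = 19yy
            day, month, year = (int(g) for g in old.groups())
            return datetime(1900 + year, month, day).strftime("%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%dT%H:%M:%S")
        except ValueError:
            pass
    return None
```

Only the header blocks are read, never the pixel data, and fractional seconds are dropped in the output string.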

% survey_headers.sh list.ALL N

Holy shit, this takes 2 seconds to run.  Finally, I made a test on mcs:

To make a file of many images: 
fitsfind /hetdata/data/20190619/acm >list.1 
fitsfind /hetdata/data/20190618/acm >>list.1 
fitsfind /hetdata/data/20190617/acm >>list.1 

[sco@mcs ~/tmp]$ wc -l list.1
795 list.1

[sco@mcs ~/tmp]$ survey_headers.sh list.1 N 
Time for this run              = 38.000000   (seconds) 

This is the way to go!!!! 




