The problem of how to survey the properties of a directory structure is discussed. I wrote some old stuff, but didn't document for shit:
  -rwxrwxrwx 1 sco sco 1827 Oct 5 12:28 disk_survey*         # summarizes a dir tree
  -rwxrwxrwx 1 sco sco 1913 Oct 3  2017 disk_archiver*       # prepares tar files
  -rwxrwxrwx 1 sco sco  576 Oct 3  2017 disk_list_locdirs*
  -rwxrwxrwx 1 sco sco 1228 Oct 2  2017 disk_untar*
  -rwxrwxrwx 1 sco sco 1364 Oct 1  2017 disk_locdir_tar*

I wrote some old, sucky notes: Notes_sco/README.disk_survey
This section was motivated by some frustration arising from reading the man pages for the unix utilities "ls" and "du". Both of these tools are very useful, but getting a clear understanding of just what units the file sizes are expressed in is troublesome.
I use a FITS image that I understand is not in multi-extension format.

  % cp $critfiles/tdat/n4625_pfcB.fits .

Then I estimate the image size in bytes, and measure it with "ls" and "du".

  % imhead n4625_pfcB.fits > a
  % linecnt a  ---->  30 lines

Top of the header:

  SIMPLE = T / file conforms to FITS standard?
  BITPIX = -32 / number of bits per data pixel     # 4-byte float
  NAXIS = 2 / number of data axes
  NAXIS1 = 320 / length of data axis 1
  NAXIS2 = 270 / length of data axis 2

  1 block = 2880 bytes

  Header: 30 lines x 80 char/line = 2400 chars = 0.833 blocks, rounded up to 1 block = 2880 bytes
  Pixel data (4 bytes per pixel): 320 x 270 = 86400 pixels = 345600 bytes = 120 blocks
  Total image size = 2880 + 345600 = 348480 bytes

This is EXACTLY the "ls" result:

  % ls -lt n4625_pfcB.fits
  -rw-r--r-- 1 sco sco 348480 Oct 5 13:46 n4625_pfcB.fits

Here are some other examples of using "ls":

  # Print the size in bytes (like the default above)
  % ls -l --block-size=1 n4625_pfcB.fits
  -rw-r--r-- 1 sco sco 348480 Oct 5 13:46 n4625_pfcB.fits

  # Print the size in kilobytes (K = 1024 bytes)
  % ls -l --block-size=K n4625_pfcB.fits
  -rw-r--r-- 1 sco sco 341K Oct 5 13:46 n4625_pfcB.fits

  # Print the size in megabytes
  % ls -l --block-size=M n4625_pfcB.fits
  -rw-r--r-- 1 sco sco 1M Oct 5 13:46 n4625_pfcB.fits

Here we see the rub: specifying size units beyond bytes results in ONLY APPROXIMATE answers. The values are rounded up to the next whole unit: 348480 bytes is 340.3K (printed as 341K) and 0.33M (printed as 1M).
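The size arithmetic above can be checked with a short Python sketch. The names `padded` and `fits_size_bytes` are my own labels, not part of any FITS library, and the calculation assumes a simple single-HDU 2-D image like the one in this example. The last two lines reproduce the 341K and 1M figures on the assumption that ls rounds up to the next whole unit, which is what the listings show.

```python
FITS_BLOCK = 2880   # bytes per FITS block
CARD_LEN = 80       # bytes per header line ("card")

def padded(nbytes, block=FITS_BLOCK):
    """Round a byte count up to a whole number of blocks."""
    return (nbytes + block - 1) // block * block

def fits_size_bytes(n_header_lines, bitpix, naxis1, naxis2):
    """Expected on-disk size of a simple (single-HDU) 2-D FITS image."""
    header = padded(n_header_lines * CARD_LEN)
    data = padded(naxis1 * naxis2 * abs(bitpix) // 8)
    return header + data

size = fits_size_bytes(30, -32, 320, 270)
size_k = -(-size // 1024)        # ceiling division: whole K units
size_m = -(-size // 1024**2)     # ceiling division: whole M units

print(size)      # 348480
print(size_k)    # 341
print(size_m)    # 1
```

Any header of 36 lines or fewer pads to the same single 2880-byte block, so the exact line count reported by imhead does not change the total here.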
Now that I trust the size of the FITS file from the last section (n4625_pfcB.fits is 348480 bytes), I can use it as a test file. I use the build_test_dir script to make copies of this file in a local subdirectory. If I make 3 copies of this file I should collect a total of 3 x 348480 = 1045440 bytes.
----------------------------------------------------------
Here is a typical run:

  % build_test_dir A1 $critfiles/tdat/n4625_pfcB.fits 3 N
  A1 is already present.
  Size of test file in bytes = 348480
  Number of files written = 3
  Total number of bytes written = 1045440
----------------------------------------------------------

I can also find the size (in bytes) of the new ./A1 subdirectory:

  % du -a -b A1
  348480  A1/AA3
  348480  A1/AA2
  348480  A1/AA1
  1049536 A1

*** Turns out the du and ls routines also count the ./ entry as an extra 4096 bytes:

  % ls -alt ./A1
  total 1040
  drwxr-xr-x 4 sco sco   4096 Oct 5 21:50 ../
  -rw-r--r-- 1 sco sco 348480 Oct 5 21:44 AA1
  -rw-r--r-- 1 sco sco 348480 Oct 5 21:44 AA2
  -rw-r--r-- 1 sco sco 348480 Oct 5 21:44 AA3
  drwxr-xr-x 2 sco sco   4096 Oct 5 21:44 ./

Note that both du and ls agree with my prediction once the directory entry is included: three copies of n4625_pfcB.fits give 1045440 bytes, and adding 4096 bytes for ./ gives 1049536 bytes.
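The same bookkeeping can be sketched in Python as a cross-check on `du -a -b`. This is an illustration only: `apparent_size` is a hypothetical helper name, it assumes a flat directory with no subdirectories, and the demo builds its own scratch directory of three 348480-byte files rather than using the real A1. The size of the directory entry itself comes from the filesystem; it was 4096 bytes in the listing above but may differ elsewhere.

```python
import os
import tempfile

def apparent_size(dirpath):
    """Return ({file name: size in bytes}, size of the directory entry itself)."""
    sizes = {}
    for entry in os.scandir(dirpath):
        if entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    dir_own = os.stat(dirpath).st_size   # the "./" entry
    return sizes, dir_own

# Demo: three dummy files with the same size as n4625_pfcB.fits.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("AA1", "AA2", "AA3"):
        with open(os.path.join(tmp, name), "wb") as fh:
            fh.write(b"\0" * 348480)
    sizes, dir_own = apparent_size(tmp)

files_total = sum(sizes.values())
print(files_total)             # 1045440
print(files_total + dir_own)   # equals the du total; 1049536 if the directory entry is 4096 bytes
```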
The disk_survey routine can be used to summarize in a simple way the size and structure of a directory tree. It takes as input a list of the subdirectories in your local working directory that you want to survey. You can use the routine disk_list_locdirs to compile such a list.
  % disk_survey
  Usage: disk_survey list.1 Y
  arg1 - file list directories to survey
  arg2 - debug flag (Y/N)

  % ls
  A1/ A2/ list.1

  % cat list.1
  A1
  A2

  % disk_survey list.1 N
  1049536 3 1 A1
  1049536 3 1 A2

  % ls
  A1/ A2/ disk_survey.out junk.fp list.1 List.dirs List.files pars.in

  % cat disk_survey.out
  1049536 1.050 3 1 A1
  1049536 1.050 3 1 A2

Need to clean this up and make a better output!
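For reference, here is a minimal Python sketch of the kind of summary shown in disk_survey.out. It is not the disk_survey script itself, whose source is not shown here. It assumes the columns are total bytes, megabytes (bytes / 10^6), number of files, and number of directories (counting the top directory), which is consistent with the A1 numbers above. The names `survey_dir` and `format_row` are hypothetical, and the demo runs on a scratch directory.

```python
import os
import tempfile

def survey_dir(top):
    """Walk a tree and return (total bytes, number of files, number of directories)."""
    total = 0
    n_files = 0
    n_dirs = 0
    for dirpath, dirnames, filenames in os.walk(top):
        n_dirs += 1
        total += os.lstat(dirpath).st_size          # the directory entry itself
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
            n_files += 1
    return total, n_files, n_dirs

def format_row(name, total, n_files, n_dirs):
    """One labelled-by-position row: bytes, MB, files, dirs, name."""
    return f"{total:>12d} {total / 1e6:9.3f} {n_files:6d} {n_dirs:5d}  {name}"

# Demo on a scratch directory holding three 100-byte files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("AA1", "AA2", "AA3"):
        with open(os.path.join(tmp, name), "wb") as fh:
            fh.write(b"\0" * 100)
    demo_total, demo_files, demo_dirs = survey_dir(tmp)
    demo_dir_own = os.lstat(tmp).st_size

# The A1 numbers from the run above, formatted in fixed-width columns.
row = format_row("A1", 1049536, 3, 1)
print(row)
```

Fixed-width columns plus a header line naming each column would be one way to get the better output wanted here.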