Disk usage and data backup methods
Last updated: Oct 6, 2019

This page discusses how to survey the properties of a directory structure. I wrote some tools for this a while ago, but never documented them properly:


-rwxrwxrwx 1 sco sco  1827 Oct  5 12:28 disk_survey*            # summarizes a dir tree 
-rwxrwxrwx 1 sco sco  1913 Oct  3  2017 disk_archiver*          # prepares tar files 
-rwxrwxrwx 1 sco sco   576 Oct  3  2017 disk_list_locdirs*
-rwxrwxrwx 1 sco sco  1228 Oct  2  2017 disk_untar*
-rwxrwxrwx 1 sco sco  1364 Oct  1  2017 disk_locdir_tar*

I wrote some old, sucky notes: 
Notes_sco/README.disk_survey

  1. What is the size of a single FITS image?
  2. How can I build a test directory tree?
  3. How do I use disk_survey?



What is the size of a single FITS image?

This section was motivated by some frustration arising from reading the man pages for the unix utilities "ls" and "du". Both of these tools are very useful, but getting a clear understanding of just what units the file sizes are expressed in is troublesome.


I use a FITS image that, as far as I understand, is not in a multi-extension format.
% cp $critfiles/tdat/n4625_pfcB.fits .  

Then I proceed to estimate the image size in bytes, and then measure it with
"ls" and "du".

  % imhead n4625_pfcB.fits > a
  % linecnt a  ---->  30 lines  

Top of the header     
SIMPLE  =                    T  / file conforms to FITS standard?
BITPIX  =                  -32  / number of bits per data pixel         # 4-byte float 
NAXIS   =                    2  / number of data axes
NAXIS1  =                  320  / length of data axis   1
NAXIS2  =                  270  / length of data axis   2

1 block = 2880 bytes 

Header:    
  30 lines x 80 chars/line = 2400 chars = 0.833 blocks, rounded up to 1 block = 2880 bytes

Pixel data (4 bytes per pixel): 
  320 x 270 = 86400 pixels =  345600 bytes = 120 blocks =         345600 bytes

Total image size = 2880 + 345600  = 348480 bytes 
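
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the block-rounding rule, not part of the disk tools; the function name fits_file_size is made up for this example, and it assumes a single-HDU file where each header line is one 80-byte card.

```python
import math

FITS_BLOCK = 2880   # bytes per FITS block
CARD_LEN = 80       # bytes per header line (card)


def round_up_to_block(n_bytes):
    """Round a byte count up to a whole number of FITS blocks."""
    return -(-n_bytes // FITS_BLOCK) * FITS_BLOCK


def fits_file_size(n_header_lines, bitpix, axes):
    """Expected size in bytes of a simple (single-HDU) FITS file.

    Hypothetical helper: n_header_lines is the header line count,
    bitpix the BITPIX value, axes the (NAXIS1, NAXIS2, ...) lengths.
    """
    header_bytes = round_up_to_block(n_header_lines * CARD_LEN)
    data_bytes = round_up_to_block(math.prod(axes) * abs(bitpix) // 8)
    return header_bytes + data_bytes


print(fits_file_size(30, -32, (320, 270)))   # 348480
```

The header and the pixel data are each padded separately to a whole number of 2880-byte blocks, which is why the 2400-character header still costs a full block.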

This is EXACTLY the "ls" result: 
% ls -lt n4625_pfcB.fits
-rw-r--r-- 1 sco sco 348480 Oct  5 13:46 n4625_pfcB.fits             

Here are some other examples of using  "ls" 
# Print the size in bytes (like the default above) 
% ls -l --block-size=1 n4625_pfcB.fits
-rw-r--r-- 1 sco sco 348480 Oct  5 13:46 n4625_pfcB.fits

# Print the size in kilobytes 
% ls -l --block-size=K n4625_pfcB.fits
-rw-r--r-- 1 sco sco 341K Oct  5 13:46 n4625_pfcB.fits

# Print the size in megabytes 
% ls -l --block-size=M n4625_pfcB.fits
-rw-r--r-- 1 sco sco 1M Oct  5 13:46 n4625_pfcB.fits

 
Here we see the rub: with any unit larger than bytes, ls rounds the size up to the next whole unit (K = 1024 bytes, M = 1024 x 1024 bytes), so the answers are ONLY APPROXIMATE.
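
The K and M values printed by ls above are consistent with dividing the byte count by the unit and rounding up. A minimal Python sketch of that rule follows; size_in_units is a made-up name for illustration, and it assumes K means 1024 bytes and M means 1024 x 1024 bytes.

```python
def size_in_units(n_bytes, unit_bytes):
    """Number of whole units needed to hold n_bytes (ceiling division)."""
    return -(-n_bytes // unit_bytes)


size = 348480                          # n4625_pfcB.fits, in bytes
print(size_in_units(size, 1))          # 348480
print(size_in_units(size, 1024))       # 341
print(size_in_units(size, 1024 ** 2))  # 1
```

The exact value in K is 340.3125, so the 341K reported by ls overstates the file by less than one unit, while the 1M figure overstates it by roughly a factor of three.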




How can I build a test directory tree?

Now that I trust its size (n4625_pfcB.fits is 348480 bytes), I can use the FITS file from the last section as a test file. I use build_test_dir to copy this file several times into a local subdirectory. If I make 3 copies of this file, I should have written a total of 3 x 348480 = 1045440 bytes.



----------------------------------------------------------
Here is a typical run: 
% build_test_dir A1 $critfiles/tdat/n4625_pfcB.fits 3 N  
A1 is already present.

Size of test file in bytes    = 348480 
Number of files written       = 3 
Total number of bytes written = 1045440 
----------------------------------------------------------
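
For readers without the build_test_dir script, here is a rough Python sketch of what the run above appears to do. It is not the actual script: the AA1, AA2, ... file names are taken from the listings on this page, the meaning of the fourth argument (N) is not documented here and is left out, and the demo uses a zero-filled stand-in file instead of the real FITS image.

```python
import os
import shutil
import tempfile


def build_test_dir(dirname, src_file, n_copies):
    """Copy src_file into dirname as AA1..AAn; return total bytes written.

    Hypothetical stand-in for the build_test_dir script.
    """
    os.makedirs(dirname, exist_ok=True)   # fine if the directory exists
    size = os.path.getsize(src_file)
    for i in range(1, n_copies + 1):
        shutil.copyfile(src_file, os.path.join(dirname, f"AA{i}"))
    return size * n_copies


# Demo with a stand-in file of the same size as n4625_pfcB.fits.
work = tempfile.mkdtemp()
stand_in = os.path.join(work, "stand_in.fits")
with open(stand_in, "wb") as fh:
    fh.write(b"\0" * 348480)

total = build_test_dir(os.path.join(work, "A1"), stand_in, 3)
print("Total number of bytes written =", total)   # 1045440
```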

I can also find the size (in bytes) of the new ./A1 subdirectory: 
% du -a -b A1 
348480	A1/AA3
348480	A1/AA2
348480	A1/AA1
1049536	A1

*** It turns out that du also counts the directory itself, which ls lists as the 4096-byte ./ entry: 1045440 + 4096 = 1049536
% ls -alt ./A1 
total 1040
drwxr-xr-x 4 sco sco   4096 Oct  5 21:50 ../
-rw-r--r-- 1 sco sco 348480 Oct  5 21:44 AA1
-rw-r--r-- 1 sco sco 348480 Oct  5 21:44 AA2
-rw-r--r-- 1 sco sco 348480 Oct  5 21:44 AA3
drwxr-xr-x 2 sco sco   4096 Oct  5 21:44 ./

 
Note that du and ls agree: the three copies of n4625_pfcB.fits account for my predicted 1045440 bytes, and the 4096-byte directory entry brings the du total to 1049536 bytes.
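
The same accounting can be checked from Python. The sketch below adds up the apparent size of every file plus the size of each directory entry itself, which is how the du -b total above comes out larger than the sum of the files. The function name apparent_size is made up for this example, hard links are not de-duplicated the way du does, and the size of a directory entry depends on the filesystem (4096 bytes in the listing above).

```python
import os
import tempfile


def apparent_size(path):
    """Apparent size in bytes of path, counting directories themselves."""
    total = os.lstat(path).st_size          # the entry itself
    if os.path.isdir(path) and not os.path.islink(path):
        with os.scandir(path) as entries:
            for entry in entries:
                total += apparent_size(entry.path)
    return total


# Demo: three stand-in files of 348480 bytes each in a fresh directory.
demo_dir = tempfile.mkdtemp()
for name in ("AA1", "AA2", "AA3"):
    with open(os.path.join(demo_dir, name), "wb") as fh:
        fh.write(b"\0" * 348480)

files_only = 3 * 348480
dir_entry = os.lstat(demo_dir).st_size
print("files only     :", files_only)
print("directory entry:", dir_entry)
print("du-style total :", apparent_size(demo_dir))
```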




How do I use disk_survey?

The disk_survey routine summarizes, in a simple way, the size and structure of a directory tree. It takes as input a list of the subdirectories in your local working directory that you want to survey. You can use the routine disk_list_locdirs to compile such a list.


% disk_survey
Usage: disk_survey list.1 Y 
arg1 - file list directories to survey
arg2 - debug flag (Y/N) 
% ls
A1/  A2/  list.1
% cat list.1
A1
A2

% disk_survey list.1 N 
1049536 	3 	1 	A1 
1049536 	3 	1 	A2 

% ls
A1/  A2/  disk_survey.out  junk.fp  list.1  List.dirs  List.files  pars.in

% cat disk_survey.out
        1049536         1.050          3       1   A1                            
        1049536         1.050          3       1   A2           
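
As a reading aid for the output above, here is a rough Python sketch of a survey that produces similar rows. It is not the disk_survey code. It assumes the columns are total bytes, size in MB (taking 1 MB = 1000000 bytes, which matches 1049536 -> 1.050), number of files, number of directories, and the directory name; the function names survey and survey_list are made up for this example.

```python
import os


def survey(dirname):
    """Return (total_bytes, n_files, n_dirs) for the tree under dirname.

    The top directory counts as one directory, and each directory's own
    entry size is included in the total, as du -b does.
    """
    total = os.lstat(dirname).st_size
    n_files, n_dirs = 0, 1
    for root, dirs, files in os.walk(dirname):
        for d in dirs:
            total += os.lstat(os.path.join(root, d)).st_size
            n_dirs += 1
        for f in files:
            total += os.lstat(os.path.join(root, f)).st_size
            n_files += 1
    return total, n_files, n_dirs


def survey_list(list_file):
    """Survey each directory named in list_file (one name per line)."""
    rows = []
    with open(list_file) as fh:
        for line in fh:
            name = line.strip()
            if not name:
                continue                      # skip blank lines
            total, n_files, n_dirs = survey(name)
            rows.append(f"{total:15d} {total / 1e6:13.3f} "
                        f"{n_files:10d} {n_dirs:7d}   {name}")
    return rows
```

Calling survey_list("list.1") from the directory that holds A1 and A2 would return one formatted row per listed directory.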

 
Need to clean this up and make a better output!




Back to SCO code page