Monday, July 21, 2014

An introduction to quantitative astrophotography on Linux: Part 2 - RAW to FITS and Dark Frames

In part 1 of this series I introduced my reasons for wanting to look into what I called quantitative astrophotography, by which I mean getting god quantitative estimates of the performance of a very simple, entry level, set of DSLR images of the night sky. I wanted to control and understand the quantitative changes made to the images, and hence chose to avoid the Windows-based GUI software common in the amateur astronomy world. Instead I'll use standard astronomical data reduction and analysis techniques and stick to open source tools running under Linux (actually in this case, OSX with macports).

By the end of part 1 I was in possession of ~30 RAW images taken with a Canon Digital Rebel XTi, including a series of dark frames and bias frames. This article covers the next step in the process: converting the RAW images into FITS format, along with a quick look at the dark frame properties. The process of finding the bad pixels and removing them from the images will be covered in Part 3.

Dark frame and bad pixel data reduction flow chart

The flow chart summarizes the actions that need to be performed on the RAW images to end up with FITS images that have had the bad pixels, the dark current and the bias removed.

The basic idea behind astronomical imaging is to use a series of calibration images to improve the final image of the target. At these early stages of reduction we're talking about using dark frames, bias frames, and bad pixel masks to remove the additional signal in the image that is not from the sky, but from the detector electronics itself. Originally I wrote a lengthy description of each of these calibration image types, but I know think it is better just to point interested readers at more extensive discussions at:
  1. Starizona's Guide to CCD Imaging.
  2. Mischa's Data Reduction web page at the University of Bonn.

The other reason I decided to cut down on the terminology discussion is that is very CCD-centric, as they are the classic detector type used in professional astronomy. However digital cameras almost always use CMOS detectors and not classic CCDs. The on-Camera data processors inside digital camera perform operations that alter the data values in ways that professional CCD imagers do not.

dcraw to FITS.

To create a FITS file from a RAW file we use the -c flag on the dcraw command line and pipe the output into a program that can convert PNM format to FITS. The dcraw web page recommends pnmtofits, e.g.
dcraw -c (other_flags) input_image.cr2 | pnmtofits > output.fits

However there is one problem with this. dcraw and many image processing utilties assume the first pixel is the top left of an image, whereas the FITS conventsion is the first pixel is the bottom left. So FITS files created with pnmtofits are upside-down. ImageMagick is smart enough to know about fits and corrects for this, so we can make correctly-oriented FITS files like this:
dcraw -c (other_flags) input_image.cr2 | convert pnm:- output.fits

Dark and/or Bias Frame Creation

The dark and bias frames are not exposed to light, so the Bayer RGB mask on top of the detector is not important and the signal doesn't depend on whether each pixel is R, G or B. This means we convert the RAW file to a single channel (grayscale) image for analysis.

It is important that both the raw count output from the camera's ADU and image orientation are preserved, i.e. no gamma correction or histogram manipulation should be performed on the raw file. We want the exact 12-bit (or 14-bit if you have a newer camera) values [except that we cannot control what the Camera's image processor does to the RAW images].

The corresponding dcraw command line options are -D -4 -j -t 0. Hence for each raw format dark and/or bias frame we initially run the following command to convert to FITS:
dcraw -c -D -4 -j -t 0 input_dark.cr2 | convert pnm:- output_dark.fits

Dark and Bias Frame Statistics

The following table summarizes some simple statistical properties of all the dark and bias frames I took. The file name also contains the ISO value (400, 800 or 1600) and whether bad pixel had been removed in the dcraw to FITS conversion, where "nobadpix" means "no bad pixel removal". We'll deal with the bad pixel identification and removal in the next post.

FileRowsColsMean +/- StdDevMedianMinimumMaximum
Prior to bad pixel removal
dark_0694_t30.0_800_Sat_Jan_18_22:04:46_2014_nobadpix.fits26023906256.11 +/- 6.39256.50170.54056.5
dark_0695_t30.0_800_Sat_Jan_18_22:05:41_2014_nobadpix.fits26023906256.11 +/- 6.34256.50177.54056.5
dark_0696_t30.0_1600_Sat_Jan_18_22:06:22_2014_nobadpix.fits26023906256.42 +/- 10.89257.5095.54057.5
dark_0697_t30.0_1600_Sat_Jan_18_22:06:57_2014_nobadpix.fits26023906256.38 +/- 10.84257.50105.54057.5
dark_0698_t30.0_400_Sat_Jan_18_22:07:40_2014_nobadpix.fits26023906255.99 +/- 4.03255.50219.54055.5
dark_0699_t30.0_400_Sat_Jan_18_22:08:14_2014_nobadpix.fits26023906256.00 +/- 4.02255.50218.54055.5
bias_0700_t0.00025_400_Sat_Jan_18_22:09:03_2014_nobadpix.fits26023906255.99 +/- 2.95255.50210.5333.5
bias_0701_t0.00025_400_Sat_Jan_18_22:09:09_2014_nobadpix.fits26023906256.00 +/- 2.94255.50216.5328.5
bias_0702_t0.00025_800_Sat_Jan_18_22:09:21_2014_nobadpix.fits26023906256.09 +/- 4.98256.50175.5397.5
bias_0703_t0.00025_800_Sat_Jan_18_22:09:23_2014_nobadpix.fits26023906256.09 +/- 4.98256.50185.5394.5
bias_0704_t0.00025_1600_Sat_Jan_18_22:09:31_2014_nobadpix.fits26023906256.39 +/- 9.27257.50101.5531.5
bias_0705_t0.00025_1600_Sat_Jan_18_22:09:32_2014_nobadpix.fits26023906256.39 +/- 9.27257.5099.5534.5
Bad pixels removed
dark_0694_t30.0_800_Sat_Jan_18_22:04:46_2014.fits26023906256.10 +/- 5.04256.50193.5320.5
dark_0695_t30.0_800_Sat_Jan_18_22:05:41_2014.fits26023906256.10 +/- 5.04256.50177.5693.5
dark_0696_t30.0_1600_Sat_Jan_18_22:06:22_2014.fits26023906256.40 +/- 9.41257.5095.51047.5
dark_0697_t30.0_1600_Sat_Jan_18_22:06:57_2014.fits26023906256.36 +/- 9.39257.50105.5670.5
dark_0698_t30.0_400_Sat_Jan_18_22:07:40_2014.fits26023906255.99 +/- 2.96255.50219.5451.5
dark_0699_t30.0_400_Sat_Jan_18_22:08:14_2014.fits26023906255.99 +/- 2.95255.50218.5316.5
bias_0700_t0.00025_400_Sat_Jan_18_22:09:03_2014.fits26023906255.99 +/- 2.95255.50210.5307.5
bias_0701_t0.00025_400_Sat_Jan_18_22:09:09_2014.fits26023906256.00 +/- 2.94255.50216.5299.5
bias_0702_t0.00025_800_Sat_Jan_18_22:09:21_2014.fits26023906256.09 +/- 4.98256.50175.5341.5
bias_0703_t0.00025_800_Sat_Jan_18_22:09:23_2014.fits26023906256.09 +/- 4.98256.50185.5340.5
bias_0704_t0.00025_1600_Sat_Jan_18_22:09:31_2014.fits26023906256.39 +/- 9.27257.50101.5418.5
bias_0705_t0.00025_1600_Sat_Jan_18_22:09:32_2014.fits26023906256.39 +/- 9.27257.50102.5434.5

We can immediately see that the RAW values have been altered by the on-camera image processing chip.
  1.  The mean and median values of all the dark frame and bias frames are essentially 256+/-1, irrespective of whether it is a 30 second dark frame or a 1/4000'th of a second bias exposure, and irrespective of the ISO value or whether bad pixels have been removed. That value of 256 is 2^8, which would be highly suggestive of a camera-processor induced value even if we didn't already suspect some on-board processing of the RAW frames.
  2. The standard deviations of the mean don't relate the mean values. 
In the next post in this series we will look at bad pixel detection and removal using dcraw.

Friday, July 11, 2014

How much does a square root cost?

An interesting diversion: how much (time) does calling the square root function sqrt() cost, and is it even worth trying alternative less-accurate implementations?

Some Assembly Required provides some answers: 

Method Total time Time per float Avg Error
Compiler sqrt(x) /
404.029ms 24ns 0.0000%
SSE intrinsic ssqrts 200.395ms 11.9ns 0.0000%
Carmack’s Magic Number rsqrt * x 72.682ms 4.33ns 0.0990%
SSE rsqrtss * x 20.495ms 1.22ns 0.0094%
SSE rsqrtss * x
with one NR step
53.401ms 3.18ns 0.0000%
SSE rsqrtss * x
with one NR step, unrolled by four
48.701ms 2.90ns 0.0000%
Method Total time Time per float Avg Error
Carmack’s Magic Number rsqrt 59.378ms 3.54ns 0.0990%
SSE rsqrtss 14.202ms 0.85ns 0.0094%
SSE rsqrtss
with one NR step
45.952ms 2.74ns 0.0000%

Monday, June 23, 2014

Useful OSX-specific command line tools

Life-hacker has a short list of uses OS X command line programs that aren't just standard GNU/*nix commands:

This mentions homebrew, which I hadn't heard of before. This answer to a stack overflow question about what are the differences between homebrew and macports gives some info (install location, user space vs sudo use, number of packages, support for multiple versions etc) but may not be enough to allow a definite decision on which to use without some experimentation. For science it appears currently homebrew has some but not all of the standard packages I'd expect.

Thursday, June 05, 2014

The (Forever) Language War

I generally find comparisons between programming languages to be a waste of time given they're largely based on personal preference over objective metrics, and even metric-based comparisons are often superficial or downright misleading (yeah, I'm talking about you Julia). Generally my view is use what best fits the application modulo your expertise, and the language doesn't really matter - it is an implementation choice.

That said, I've lately being feeling rather down about C++ which is the language I develop in most heavily. Sure, it has speed and expressive power but some of the most powerful features are semantically non-trivial and most galling of all, it really doesn't come with many standard "packages" in the way Java and Python do. OK, it comes with the STL, which I do like, but then you have to go to Boost, and then you look at how some of the Boost packages are coded and you wonder if it is even the same language. In a very large and complicated project I find the developer (or more technically, integration troubleshooter and debugger) role to have low productivity. It isn't as bad as perl but it is easy for "other" programmers to write code that is just overly complicated for the job. Oh great, someone has learnt about the Abstract Factory pattern, and has decided to use it everywhere: attempt to drill down through layers of unnecessary obfuscatory abstraction (that manages to be coupled to a million other things instead of decoupling the way abstraction should do) to find what some code is actually doing. Analyze, debug, code, compile ... ... ... and run again, all while you have a supposedly critical release deadline looming.

It is easy to forget the problem what C++ set out to fix, which I was reminded of when I starting reading "Learning OpenCV". Now OpenCV is a wonderfully powerful open source library in my opinion, based on mainly using it before now through its Python bindings. But reading how to use the C API was eye opening. What, I have to remember to delete those resources? Oh great, void* pointers and I have to remember what type to cast to/from. I could just imagine the fun bugs someone had to deal with. Suddenly C++ didn't seem so doddering.

While reading about Apple's new Swift language at (Another new proprietary language to tie developers to Apple and in darkness bind them, how exciting for the Mac fan boys!) I came across a link to a discussion by David Green comparing his experience trying out both iOS and Android development tools. The comments are the article reveal the pattern that makes me dislike language comparisons: In a discussion of X and Y you must be biased toward X because you know more about it X so really X and Y are equally good. (There should be a name for this fallacy but I can't seem to find it). However the article itself is worth a look, because it compares the iOS+Xcode+Objective-C platform to Android+Eclipse+Java platform in terms of ease of development. I've always been somewhat curious about Xcode and Objective-C, but David Green's article is enough to stop me wasting my time. What the commentators on his article don't understand is that while you can do great stuff with Xcode and Objective-C it has placed extra hurdles in your way: the extra verbosity needed in the language, the extra time it'll take to hunt down solutions to non-trivial problems. That time might not matter to some developers, in particular to Apple-only developers, but it will to matter to someone and it certainly matters to me.