Reconstruction Pipeline Script
From Utah Center for Neuroimage Analysis
The Marc lab processing pipeline is a Python script which automates construction of the retinal volume using the SCI's NCR Toolset.
Extracting images from the JEOL Electron Microscope
The JEOL EM scope uses the .mrc file format to store all tiles it scans when building a mosaic. This .mrc file format is restricted to mosaics no larger than about 65,000 (2^16 = 65,536) pixels on a side, which is not large enough to cover a 250 µm x 250 µm patch in a single file at our current magnification of around 5,000X. Originally, to get around this, we collected multiple overlapping .mrc mosaics from the same section and assembled them into a single gigantic mosaic using the SCI tools in a process called "supertiling". David Mastronarde has since updated SerialEM to do "SuperMontages". These supermontages automatically collect overlapping montages and store the data in a single .mrc file by treating each mosaic as a different layer in a stack of montages. Using the Z-axis gets around the 16-bit size limitation on any one axis.
Once MRC files are captured they should be copied into a directory which will contain the volume. The most sensible way to name a volume is after the block number the sections were cut from. The MRCToVolume.py script uses the Mac-only ir-tools and executes all stages of the pipeline to create a registered volume. MRCUnpack.py stops after extracting images, but runs on Windows. The volume directory is the first argument to both scripts. I typically run MRCUnpack.py on the server downstairs, check for flipped sections, and then complete the volume on the Macs upstairs.
The first step the script performs is to create a directory for each section in a volume. Each MRC file name should start with a number followed by an underscore and any additional naming information (####[_informationstring].mrc). The number indicates which section from the block this MRC file was captured from. A directory is created for each section if it does not already exist. The MRC files are moved into the directory for their section and extracted.
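The sorting step above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline's actual code: the function names are hypothetical, and the four-digit zero-padded directory name is an assumption based on the section directories shown elsewhere on this page.

```python
import re
import shutil
from pathlib import Path

# Matches names such as 0004.mrc or 0004_A.mrc (####[_informationstring].mrc).
MRC_NAME = re.compile(r"^(\d+)(?:_[^/\\]*)?\.mrc$", re.IGNORECASE)


def section_number(mrc_name):
    """Return the section number encoded in an MRC file name, or None."""
    match = MRC_NAME.match(mrc_name)
    return int(match.group(1)) if match else None


def sort_into_sections(volume_dir):
    """Move each MRC file in volume_dir into a directory named after its section.

    Hypothetical helper: directory names are zero-padded to four digits,
    which is an assumption taken from the example tree on this page.
    """
    volume = Path(volume_dir)
    moved = {}
    for mrc in sorted(volume.glob("*.mrc")):
        number = section_number(mrc.name)
        if number is None:
            continue  # name does not follow the convention; leave it alone
        section_dir = volume / f"{number:04d}"
        section_dir.mkdir(exist_ok=True)  # created only if it is missing
        shutil.move(str(mrc), str(section_dir / mrc.name))
        moved.setdefault(number, []).append(mrc.name)
    return moved
```

Calling sort_into_sections on a volume directory holding 0004_A.mrc and 0004_B.mrc would leave both files inside a 0004 subdirectory.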
Extraction begins by reading the MRC header and finding a mapping from stage space to pixel space. This requires that each montage have at least 2x2 tiles, because we cannot calculate a difference without two data points. If the montages are too small the script assigns a value based on previous results from images captured at 5000x. Each .mrc file in a section is read and combined into a single file named ####_supertile.mosaic, where #### is the section number. This allows capture of a single section to be broken up into multiple sessions which are then combined into a single mosaic based on the reported stage coordinates.
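To make the stage-to-pixel mapping concrete, here is a small sketch of the calculation. It assumes the scale is taken from the distance between two tiles in each coordinate system; the function names and the fallback parameter are hypothetical, and the real script's fallback value for 5000x is not given on this page.

```python
import math


def stage_to_pixel_scale(pixel_a, pixel_b, stage_a, stage_b, fallback):
    """Estimate pixels per stage unit from two tiles of one montage.

    pixel_a and pixel_b are (x, y) pixel positions of two tiles; stage_a and
    stage_b are the stage positions reported for the same tiles. fallback is
    returned when the stage positions do not differ, for example when the
    montage is too small to supply two distinct data points.
    """
    pixel_delta = math.hypot(pixel_b[0] - pixel_a[0], pixel_b[1] - pixel_a[1])
    stage_delta = math.hypot(stage_b[0] - stage_a[0], stage_b[1] - stage_a[1])
    if stage_delta == 0:
        return fallback
    return pixel_delta / stage_delta


def stage_to_pixels(stage_xy, scale):
    """Convert one stage coordinate to pixel space with the scalar above."""
    return (stage_xy[0] * scale, stage_xy[1] * scale)
```

Because every montage of a section is converted with the same kind of scalar, captures from separate sessions end up in one shared pixel space.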
We sample the MRC file to determine the standard deviation and mean of pixel values, and scale the 16-bit MRC to an 8-bit MRC by fitting all pixels in a range 2.5 standard deviations from the mean using the “newstack” IMOD command. The individual tiles in the .mrc file measure 4096x4096, but there is an 8 pixel border of garbage. We crop these pixels in the same pass by using the -size 4080,4080 parameter of the “newstack” command. We estimate stage position by finding the Z-level with the largest number of tiles. Measuring from the corners, we divide the change in pixel position by the change in stage position to find a scalar value by which all stage coordinates in the montage are multiplied. This generates the “Supertile.mosaic” file, which is the starting point for the SCI tools.
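The intensity scaling and border crop described above reduce to a little arithmetic. The sketch below only illustrates that arithmetic in plain Python; it does not call newstack, and the function names are hypothetical.

```python
import statistics


def intensity_window(samples, n_sigma=2.5):
    """Return the (low, high) range lying n_sigma standard deviations around the mean."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    return mean - n_sigma * std, mean + n_sigma * std


def to_8bit(value, low, high):
    """Map a 16-bit pixel value into 0..255, clamping values outside the window."""
    if high <= low:
        return 0  # degenerate window, e.g. a constant image
    scaled = (value - low) / (high - low) * 255.0
    return max(0, min(255, round(scaled)))


def cropped_size(tile_size=4096, border=8):
    """Tile edge length after trimming the garbage border from both sides."""
    return tile_size - 2 * border
```

With the numbers from the text, cropped_size() gives 4096 - 2 x 8 = 4080, which matches the -size 4080,4080 argument.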
The script then extracts images from the 8-bit MRC file into a subdirectory of the section named “8-bit\001” using the IMOD command “mrc2tif”. The directory name indicates the images are 8-bit files which have a down-sample level of 1 (no downsampling). Image files are named with the section number followed by a period and then the tile number.
Executing the ir-tools
Once tiles are extracted we convert images to PNG format for compatibility with the Windows viewer and, in the same pass, create a pyramid by down-sampling each tile image by a factor of 2. Each level of the pyramid lives in a directory named after the downsample level. We down-sample by powers of 2 to a final level of 64.
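As an illustration of the pyramid layout, the sketch below lists the downsample levels, formats their directory names, and halves a tile. The 2x2 block averaging is an assumption chosen for the example; this page does not say which resampling method the pipeline uses, and the function names are hypothetical.

```python
def pyramid_levels(max_level=64):
    """Return the downsample factors from 1 up to max_level, doubling each time."""
    levels = []
    level = 1
    while level <= max_level:
        levels.append(level)
        level *= 2
    return levels


def level_dir_name(level):
    """Directory name for a downsample level, e.g. 1 -> '001', 64 -> '064'."""
    return f"{level:03d}"


def downsample_by_two(image):
    """Halve a grayscale image (a list of rows) by averaging 2x2 blocks.

    Averaging is an assumption made for this sketch. An odd trailing row or
    column is dropped.
    """
    height = len(image) - len(image) % 2
    width = len(image[0]) - len(image[0]) % 2 if image else 0
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]
```

Applying downsample_by_two repeatedly to the level 1 tiles would fill the 002, 004, and later directories in turn.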
Once this is done it is a good idea to examine the section to see whether it was flipped. If it was, open or create a file named fliplist.txt and add the section number on a new line in the file. Delete the extracted images and .mosaic files for that section and rerun the extraction script. This will re-extract the MRC files, but flip the images and the supertile.mosaic transform.
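Reading the flip list is straightforward. The following is a minimal sketch of how such a file could be parsed, assuming one section number per line; the function name is hypothetical and the file path is passed in by the caller.

```python
from pathlib import Path


def read_flip_list(path):
    """Return the set of section numbers listed one per line in the flip list.

    A missing file simply means no sections are flipped. Blank lines and
    lines that are not plain numbers are ignored.
    """
    flip_file = Path(path)
    if not flip_file.exists():
        return set()
    sections = set()
    for line in flip_file.read_text().splitlines():
        line = line.strip()
        if line.isdigit():
            sections.add(int(line))
    return sections
```

During extraction, a check such as `section in read_flip_list(path)` would then decide whether the images for that section are flipped.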
A big advantage of the pyramid built by the pipeline is the ir-tools do not have to downsample the tiles at execution time. We run each ir-tool using -sh 1 to indicate no-downsampling and override the pixel spacing with the -sp option. The script then changes directories to the proper downsample directory for execution and references any .mosaic files with an absolute path.
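The directory handling described above might look roughly like the sketch below. Only the -sh and -sp options come from this page; every other argument an ir-tool needs is deliberately left out, the pixel spacing value is supplied by the caller, and the function name and the three-digit level directory format are assumptions for illustration.

```python
from pathlib import Path


def ir_tool_invocation(tool, section_dir, image_type, level, mosaic_name, pixel_spacing):
    """Return (working_dir, mosaic_path, args) for one ir-tool run.

    working_dir is the downsample directory the tool should be run from,
    mosaic_path is the absolute path of the .mosaic file in the section
    directory, and args holds only the options named in the text.
    """
    section = Path(section_dir).resolve()
    working_dir = section / image_type / f"{level:03d}"
    mosaic_path = section / mosaic_name  # absolute, so it survives the directory change
    args = [tool, "-sh", "1", "-sp", str(pixel_spacing)]
    return working_dir, mosaic_path, args
```

A caller could pass the argument list, extended with the tool's remaining options, to subprocess.run with cwd set to working_dir rather than changing the process's own directory.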
The script runs ir-refine-translate on supertile.mosaic, which creates translate.mosaic. It then runs ir-refine-grid on translate.mosaic to produce grid.mosaic. Once this is done, ir-assemble is run using grid.mosaic as input. We create different levels of the assembled image for use with the .stos tools.
Once a single mosaic is assembled for each section, each section is registered to the sections above and below it using the slice-to-slice registration tools (stos). Since individual sections may not be perfect, the script checks for a tab-delimited text file in the volume directory named stosmap.txt. The first line of the file is ignored and contains a description of each column. The first column is the section to map, the second column is the reference section, the third column is any arguments to pass to the stos commands, and the fourth column contains notes. If stosmap.txt exists it must contain an entry for each section; a section that does not appear in the file will not be registered. For example, if sections 4 and 6 were good but section 5 was folded, we would create these entries to get what information we could from section 5 without misregistering section 6:
0005 0004
0006 0004
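A file in this format can be read with a few lines of Python. This is a sketch of one possible parser, not the script's own; the function name and the returned tuple layout are hypothetical.

```python
def parse_stosmap(text):
    """Parse stosmap.txt content into {mapped_section: (reference, args, notes)}.

    The first line is skipped because it only describes the columns. The
    third and fourth columns are optional on any given line.
    """
    mapping = {}
    for line in text.splitlines()[1:]:
        if not line.strip():
            continue
        columns = line.split("\t")
        if len(columns) < 2:
            continue  # need at least the mapped and reference sections
        mapped = int(columns[0])
        reference = int(columns[1])
        args = columns[2].strip() if len(columns) > 2 else ""
        notes = columns[3].strip() if len(columns) > 3 else ""
        mapping[mapped] = (reference, args, notes)
    return mapping
```

For the two entries above, the result maps section 5 to reference 4 and section 6 to reference 4, so the folded section 5 never serves as a reference.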
The script is recursive in nature. Once called, it processes all subdirectories that contain .mrc files and skips targets that have already been processed. This allows it to be run as a batch job each night, where it will only work on new data added to the volume.
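The skip-if-done behaviour can be sketched as a directory walk. How the real script decides that a target is finished is not described here, so this example assumes a section counts as processed once a given output directory exists; the function name and the marker parameter are hypothetical.

```python
import os


def sections_to_process(volume_dir, marker="8-bit"):
    """Return directories under volume_dir that hold .mrc files but no marker output.

    The marker directory standing in for "already processed" is an
    assumption made for this sketch.
    """
    pending = []
    for current, subdirs, files in os.walk(volume_dir):
        has_mrc = any(name.lower().endswith(".mrc") for name in files)
        if has_mrc and marker not in subdirs:
            pending.append(current)
    return sorted(pending)
```

A nightly batch job built this way only touches sections added since the previous run.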
Directory Structure
In order for this script to work without constant modification we have to adopt a consistent directory structure for our EM volume data.
This is the directory structure I'm currently writing my scripts against:
/Volume/Section/ImageType/DownsampleLevel/ImageTiles
Example Directory Tree:
This is a guide on what you can expect to find where. Entries in <> are notes for clarity.
-RabbitReconstrutionVolume <Volume directory>
    0002-0001_brute_32.stos
    0002-0001_grid_32.stos
    0001_mosaic_32.png <Assembled mosaic>
    0002_mosaic_32.png
    +0001 <Section Directory>
    +0002 <Section Directory>
    -0003 <Section Directory>
        0003_supertile.mosaic
        translate.mosaic
        grid.mosaic
        01.mrc <SerialEM output from EM>
        02.mrc
        03.mrc
        +Clahe
        +16-bit
        -8-bit
            -001 <Tiles which are not downsampled>
                0003.000.png <Tile>
                0003.001.png <Tile>
                0003.002.png <Tile>
            +002 <Tiles downsampled by a factor of 2>
            +004
            +008
            +016
            +032
            +064
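The /Volume/Section/ImageType/DownsampleLevel/ImageTiles layout can be expressed as a small path helper. This sketch is illustrative only: the function name is hypothetical, and the zero padding (four digits for sections, three for levels and tiles) is inferred from the example tree.

```python
from pathlib import PurePosixPath


def tile_path(volume, section, image_type, level, tile):
    """Build the path of one tile inside the volume directory structure.

    Padding widths are assumptions taken from the example tree, not from
    the pipeline's source.
    """
    section_name = f"{section:04d}"
    tile_name = f"{section_name}.{tile:03d}.png"
    return PurePosixPath(volume, section_name, image_type, f"{level:03d}", tile_name)
```

For instance, tile 2 of section 3 at full resolution in the 8-bit set resolves to Volume/0003/8-bit/001/0003.002.png.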
Rules
.mosaic and .stos files do not store tile file names with any relative or absolute path information. They only store the actual filename of the tile.
MRC file names must begin with a section number, even if you only collect one section from the block. You can place an underscore ("_") after the number and type additional information. If you are capturing a single section in multiple captures, each name must still be unique. For example, if I took three captures from section 4 the names could be: 0004_A.mrc, 0004_B.mrc, 0004_C.mrc
File names of any type cannot have spaces.
Tiles are named <Section#>.<Tile#>.png. Ex: The 456th tile from the third section is: 0003.456.png
Mosaics have the naming convention <Section #>_mosaic_<Downsample #>.png . Ex: 0001_mosaic_32.png for section 1's mosaic downsampled by a factor of 32.
.mosaic files live in the section directory. <Section#>_supertile.mosaic is the initial stage position meta-data, translate.mosaic is the output of ir-refine-translate, and grid.mosaic is the output of ir-refine-grid.
.stos files live in the volume directory. They are named <Mapped Section #>-<Control Section #>_<brute|grid>_<Downsample #>.stos. Ex: A file generated by ir-stos-brute to register section 3 to section 1 using a downsample factor of 16 would be named 0003-0001_brute_16.stos
stosmap.txt: Lives in volume directory. The first line of this text file contains descriptive names for each column. Each line after that contains information for registering one pair of sections. Each column is tab delimited.
flip.txt: Each line in this text file contains the number of a section to flip during extraction from the .mrc file.
Overlap within a montage should be at least 400 pixels; we use 600 pixels (15%). Too little overlap causes the SCI tools to fail.
Overlap between montages in a supermontage should be at least 1000 pixels if not more. The actual overlap appears to be considerably less.
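The mosaic image and .stos naming rules above are easy to get wrong by hand, so a small helper can format and parse them. This is a sketch under the conventions stated in the rules; the function names are hypothetical.

```python
import re

STOS_NAME = re.compile(r"^(\d+)-(\d+)_(brute|grid)_(\d+)\.stos$")


def mosaic_image_name(section, level):
    """Name of an assembled mosaic image, e.g. 0001_mosaic_32.png."""
    return f"{section:04d}_mosaic_{level}.png"


def stos_name(mapped, control, kind, level):
    """Name of a .stos file, e.g. 0003-0001_brute_16.stos."""
    if kind not in ("brute", "grid"):
        raise ValueError("kind must be 'brute' or 'grid'")
    return f"{mapped:04d}-{control:04d}_{kind}_{level}.stos"


def parse_stos_name(name):
    """Return (mapped, control, kind, level) or None if the name does not fit."""
    match = STOS_NAME.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3), int(match.group(4))


def has_spaces(name):
    """File names of any type cannot contain spaces."""
    return " " in name
```

Round-tripping a name through stos_name and parse_stos_name is a quick way to check that a file in the volume directory follows the convention.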
