Pure Python tools for reading and writing all TIFF IFDs, sub-IFDs, and tags.

Overview

Tiff Tools

Developed by Kitware, Inc. with funding from The National Cancer Institute.

Example

import tifftools
info = tifftools.read_tiff('photograph.tif')
info['ifds'][0]['tags'][tifftools.Tag.ImageDescription.value] = {
    'data': 'A dog digging.',
    'datatype': tifftools.Datatype.ASCII
}
# Sub-IFDs are stored as a list of lists of IFDs, so two indices are needed.
exififd = info['ifds'][0]['tags'][tifftools.Tag.EXIFIFD.value]['ifds'][0][0]
exififd['tags'][tifftools.constants.EXIFTag.FNumber.value] = {
    'data': [54, 10],
    'datatype': tifftools.Datatype.RATIONAL
}
tifftools.write_tiff(info, 'photograph_tagged.tif')

Commands

tifftools --help and tifftools <command> --help provide usage details.

  • tifftools split [--subifds] [--overwrite] source [prefix]: split a tiff file into separate files. This is also available as the library function tifftools.tiff_split.
  • tifftools concat [--overwrite] source [source ...] output: merge multiple tiff files together. Alias: tifftools merge. This is also available as the library function tifftools.tiff_concat.
  • tifftools dump [--max MAX] [--json] source [source ...]: print information about a tiff file, including all tags, IFDs, and subIFDs. Alias: tifftools info. This is also available as the library function tifftools.tiff_dump.
  • tifftools set source [--overwrite] [output] [--set TAG[:DATATYPE][,<IFD-#>] VALUE] [--unset TAG:[,<IFD-#>]] [--setfrom TAG[,<IFD-#>] TIFFPATH]: modify, add, or remove tags. This is also available as the library function tifftools.tiff_set.

Library Functions

  • read_tiff
  • write_tiff
  • Constants
  • Tag
  • Datatype
  • get_or_create_tag
  • EXIFTag, GPSTag, etc.

Installation

tifftools is available on PyPI and conda-forge.

To install with pip from PyPI:

pip install tifftools

To install with conda:

conda install -c conda-forge tifftools

Purpose

tifftools provides a library and a command line program for manipulating TIFF files. It can split multiple images apart, merge images together, set any tag in any IFD, and dump all IFDs and tags in a single command. It uses only Python standard library modules, and is therefore widely compatible.

Rationale

There was a need to combine images from multiple TIFF files without altering the image data or losing any tag information. Further, when changing tag values, it was essential that the old values were fully removed from the output.

The command line tools associated with libtiff are commonly used for similar purposes, but they have significant limitations. tiffdump and tiffinfo require multiple commands to see information from all IFDs. tiffset does not remove data from a file; rather, it appends to the file and references only the new data, leaving the old values inside the file. tiffsplit doesn't keep tags it doesn't recognize, losing data. tiffcp always reencodes images and will fail for compression types it does not know.

Likewise, there is a wide variety of EXIF tools. For the most part, these only alter tags, usually by appending to the existing file. ImageMagick's convert command also recompresses images as it combines them.

Many programs deal with both classic TIFF and BigTIFF. Some will start writing a classic TIFF, but leave a small amount of unused space just after the file header. If the file exceeds 4 GB, parts of the file are rewritten to convert it to a BigTIFF file, leaving small amounts of abandoned data within the file.

tifftools fills this need. All tags are copied, even if unknown. Files are always rewritten so that there is never abandoned data inside the file. tifftools dump shows information on all IFDs and tags. Many of the command line options are directly inspired by libtiff.

tifftools does NOT compress or decompress any image data. This is not an image viewer. If you need to recompress an image or otherwise manipulate pixel data, use libtiff or another library.

As an explicit example, with libtiff's tiffset, tag data just gets dereferenced and is still in the file:

$ grep 'secret' photograph.tif  || echo 'not present'
not present
$ tiffset -s ImageDescription "secret phrase" photograph.tif
$ tiffinfo photograph.tif | grep ImageDescription
  ImageDescription: secret phrase
$ grep 'secret' photograph.tif  || echo 'not present'
Binary file photograph.tif matches
$ tiffset photograph.tif -s ImageDescription "public phrase"
$ tiffinfo photograph.tif | grep ImageDescription
  ImageDescription: public phrase
$ grep 'secret' photograph.tif  || echo 'not present'
Binary file photograph.tif matches

Whereas, with tifftools:

$ grep 'secret' photograph.tif || echo 'not present'
not present
$ tifftools set -y -s ImageDescription "secret phrase" photograph.tif
$ tiffinfo photograph.tif | grep ImageDescription
  ImageDescription: secret phrase
$ grep 'secret' photograph.tif || echo 'not present'
Binary file photograph.tif matches
$ tifftools set -y photograph.tif -s ImageDescription "public phrase"
$ tiffinfo photograph.tif | grep ImageDescription
  ImageDescription: public phrase
$ grep 'secret' photograph.tif || echo 'not present'
not present

TIFF File Structure

TIFF files consist of one or more IFDs (Image File Directories). These can be located anywhere within the file, and are referenced by their absolute position within the file. IFDs can refer to image data; they can also contain a collection of metadata (for instance, EXIF or GPS data). Small data values are stored directly in the IFD. Bigger data values (such as image data, longer strings, or lists of numbers) are referenced by the IFD and are stored elsewhere in the file.
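The split between "small" and "bigger" values follows from the size of the value field in each IFD entry: 4 bytes in a classic TIFF and 8 bytes in a BigTIFF. The following is a minimal sketch of that rule using only the standard library. It is not taken from tifftools; the names DATATYPE_SIZES and stored_inline are hypothetical and exist only for this illustration.

```python
# Size in bytes of one element of each TIFF datatype (TIFF 6.0 plus BigTIFF).
# This table and the function below are illustrative, not part of tifftools.
DATATYPE_SIZES = {
    'BYTE': 1, 'ASCII': 1, 'SHORT': 2, 'LONG': 4, 'RATIONAL': 8,
    'SBYTE': 1, 'UNDEFINED': 1, 'SSHORT': 2, 'SLONG': 4, 'SRATIONAL': 8,
    'FLOAT': 4, 'DOUBLE': 8, 'IFD': 4, 'LONG8': 8, 'SLONG8': 8, 'IFD8': 8,
}


def stored_inline(datatype, count, bigtiff=False):
    """Return True if `count` values of `datatype` fit inside the IFD entry."""
    # The value/offset field is 4 bytes in classic TIFF and 8 bytes in BigTIFF.
    field_size = 8 if bigtiff else 4
    return DATATYPE_SIZES[datatype] * count <= field_size


print(stored_inline('SHORT', 1))                   # True
print(stored_inline('ASCII', 15))                  # False
print(stored_inline('RATIONAL', 1))                # False
print(stored_inline('RATIONAL', 1, bigtiff=True))  # True
```

A 15-byte ASCII string therefore lives elsewhere in the file, and the IFD entry holds only its offset.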

In the simple case, a TIFF file may have a list of IFDs, each one referencing the next. However, a complex TIFF file, such as those used by some Whole-Slide Image (WSI) microscopy systems, can have IFDs organized in a branching structure, where some IFDs are in a list and some reference SubIFDs with additional images.
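The simple case, a linked list of IFDs, can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not tifftools' implementation: it handles only classic TIFF, it does not follow SubIFDs, and the names iter_ifd_offsets and make_entry are hypothetical. The sample file is built in memory so the sketch is self-contained.

```python
import io
import struct


def iter_ifd_offsets(fobj):
    """Yield the offset of each IFD in the main chain of a classic TIFF."""
    header = fobj.read(8)
    bom = {b'II': '<', b'MM': '>'}.get(header[:2])
    if bom is None:
        raise ValueError('not a TIFF file')
    version, offset = struct.unpack(bom + 'HI', header[2:8])
    if version != 42:
        raise ValueError('not a classic TIFF')
    seen = set()
    # An offset of 0 ends the chain; `seen` guards against circular references.
    while offset and offset not in seen:
        seen.add(offset)
        yield offset
        fobj.seek(offset)
        (entries,) = struct.unpack(bom + 'H', fobj.read(2))
        # Skip the 12-byte entries; the next IFD offset follows them.
        fobj.seek(offset + 2 + 12 * entries)
        (offset,) = struct.unpack(bom + 'I', fobj.read(4))


def make_entry(tag, datatype, count, value):
    """Pack a 12-byte little-endian entry holding one inline SHORT value."""
    return struct.pack('<HHIH2x', tag, datatype, count, value)


# Two IFDs, each with a single ImageWidth (tag 256, SHORT) entry.
ifd1 = struct.pack('<H', 1) + make_entry(256, 3, 1, 640) + struct.pack('<I', 26)
ifd2 = struct.pack('<H', 1) + make_entry(256, 3, 1, 320) + struct.pack('<I', 0)
sample = b'II' + struct.pack('<HI', 42, 8) + ifd1 + ifd2

print(list(iter_ifd_offsets(io.BytesIO(sample))))  # [8, 26]
```

A branching file would additionally require reading tags whose datatype is IFD or IFD8 and walking the chains they point to.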

TIFF files can have their primary data stored in either little-endian or big-endian format. Offsets to data are stored as absolute numbers inside a TIFF file. There are two variations, "classic" and "BigTIFF", which use 32 bits and 64 bits for these offsets, respectively. If the file size exceeds 4 GB or the file uses 64-bit integer datatypes, it must be written as a BigTIFF.
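Both properties are recorded in the first bytes of the file: "II" or "MM" selects the byte order, and the version number is 42 for classic TIFF or 43 for BigTIFF. The following is a minimal standard-library sketch of reading that header. The function name describe_header is hypothetical, and this is not how tifftools itself is necessarily structured.

```python
import struct


def describe_header(header):
    """Return (byte_order, variant, first_ifd_offset) from the first 16 bytes."""
    if header[:2] == b'II':
        bom, order = '<', 'little-endian'
    elif header[:2] == b'MM':
        bom, order = '>', 'big-endian'
    else:
        raise ValueError('not a TIFF file')
    (version,) = struct.unpack(bom + 'H', header[2:4])
    if version == 42:
        # Classic TIFF: a 32-bit offset to the first IFD.
        (offset,) = struct.unpack(bom + 'I', header[4:8])
        return order, 'classic', offset
    if version == 43:
        # BigTIFF: offset size (always 8), a zero word, then a 64-bit offset.
        offset_size, zero, offset = struct.unpack(bom + 'HHQ', header[4:16])
        if offset_size != 8 or zero != 0:
            raise ValueError('malformed BigTIFF header')
        return order, 'BigTIFF', offset
    raise ValueError('unknown TIFF version %d' % version)


classic = b'II' + struct.pack('<HI', 42, 8)
big = b'MM' + struct.pack('>HHHQ', 43, 8, 0, 16)
print(describe_header(classic))  # ('little-endian', 'classic', 8)
print(describe_header(big))      # ('big-endian', 'BigTIFF', 16)
```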

Limitations

Unknown tags that are offsets and have a datatype other than IFD or IFD8 won't be copied properly, as it is impossible to distinguish integer data from offsets given LONG or LONG8 datatypes. This can be remedied by defining a new TiffConstant record which contains a bytecounts entry to indicate whether the offsets refer to fixed-length data or should get the length of data from another tag.

Because files are ALWAYS rewritten, tifftools is slower than libtiff's tiffset and most EXIF tools.

Comments
  • TIFF IFD concatenations/removals output images that fail JHOVE validation check for value offset word-alignment

    (https://jhove.openpreservation.org/modules/tiff/)

    > python -c 'import tifftools;tifftools.tiff_concat(["good1.svs", "good2.svs"], "out.svs", overwrite=True)'
    > jhove -m TIFF-hul out.svs
    
    Jhove (Rel. 1.24.1, 2020-03-16)
     Date: 2021-07-15 20:14:30 MDT
     RepresentationInformation: out.svs
      ReportingModule: TIFF-hul, Rel. 1.9.2 (2019-12-10)
      LastModified: 2021-07-15 20:14:20 MDT
      Size: 875685820
      Format: TIFF
      Status: Not well-formed
      SignatureMatches:
       TIFF-hul
      ErrorMessage: Value offset not word-aligned: 8842289
       ID: TIFF-HUL-4
       Offset: 8858224
      MIMEtype: image/tiff
    

    https://web.archive.org/web/20160324105748/https://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf on page 15 (about IFD entries) says:

    Bytes 8-11 The Value Offset, the file offset (in bytes) of the Value for the field. The Value is expected to begin on a word boundary; the corresponding Value Offset will thus be an even number. This file offset may point anywhere in the file, even after the image data.

    opened by fiendish 4
  • Appending a new tag to an existing TIFF image

    Hi,

    I am trying to add a new tag to a TIF file using Python3.

    I have worked through your code and I am now able to understand the TIFF structure.

    However, I need to add a new tag. My code is an attempt to adapt your example for a similar action from the command line:

    info = tifftools.read_tiff(Fpath)
    info[info, 'ifds'][0]['tags'][tifftools.Tag.ImageDescription.value] = {
        'data': 'A dog digging.',
        'datatype': tifftools.Datatype.ASCII
    }

    tifftools.write_tiff(Fpath, OutPath, info)

    I am not the strongest Python programmer, but I am baffled why I cannot update the TIFF TAG dict structure. Can you guide me please?

    regards

    Phil

    opened by pfculverhouse 3
  • Readme example seems to be incorrect

    Running the sample code from the readme file gives:

    $ python3 sample.py
    Traceback (most recent call last):
      File "sample2.py", line 8, in <module>
        exififd['tags'][tifftools.constants.EXIFTag.FNumber.value] = {
    TypeError: list indices must be integers or slices, not str
    

    It seems that we are missing a level for the EXIF data (SubIFD?).

    The following works:

    $ diff -u sample.py.orig sample.py
    --- sample.py.orig      2022-02-28 18:19:35.000000000 +0100
    +++ sample.py   2022-02-28 18:19:52.000000000 +0100
    @@ -4,7 +4,7 @@
         'data': 'A dog digging.',
         'datatype': tifftools.Datatype.ASCII
     }
    -exififd = info['ifds'][0]['tags'][tifftools.Tag.EXIFIFD.value]['ifds'][0]
    +exififd = info['ifds'][0]['tags'][tifftools.Tag.EXIFIFD.value]['ifds'][0][0]
     exififd['tags'][tifftools.constants.EXIFTag.FNumber.value] = {
         'data': [54, 10],
         'datatype': tifftools.Datatype.RATIONAL
    $ python3 sample.py
    $ tifftools dump photograph_tagged.tif | grep FNumber
          FNumber 33437 (0x829D) RATIONAL: 54 10 (5.4)
    

    Side note: the above code will fail if the TIFF file doesn't already contain EXIF data. Is the following snippet the correct way to add an EXIF IFD?

    try:
        exif = info["ifds"][0]["tags"][tifftools.Tag.EXIFIFD.value]
    except KeyError:
        exif = {
            "datatype": tifftools.Datatype.IFD,
            "ifds": [[{"tags": {}, "path_or_fobj": info["ifds"][0]["path_or_fobj"]}]],
        }
        info["ifds"][0]["tags"][tifftools.Tag.EXIFIFD.value] = exif
    exififd = exif["ifds"][0][0]
    # add tags below
    
    opened by AmedeeBulle 2
  • Examples of write tiffs from scratch, not inheriting info from an existing tiff

    Please could you provide more examples for how to use this tool to write tiffs from scratch? Thanks SO much for providing this tool. I really hope I can make it work for me!

    Problem:

    • I would like to write out numpy arrays to bigtiff with custom metadata AND exif data
    • I'm getting exif data from a jpeg, then generating outputs from that jpeg within a script, and I want to write those outputs to a tiff with exif and metadata

    Alternatives tried:

    • I can't use tifffile because it doesn't deal with exif data
    • I can't use PIL because it doesn't write exif data
    • Rasterio doesn't seem to have an exif option
    • I can't follow the provided example because I don't have a tiff to read as a starting point

    So I'm really hoping I can use tifftools, but I can't figure out (from the one provided example) how to write a tiff entirely from scratch using arrays

    Question: The provided example is useful, but only deals with the case where all the info is obtained from an existing file

    info = tifftools.read_tiff(file)

    But how would you create that info object from a numpy array, metadata, and exif separately, where metadata and exif are two separate dicts or JSON objects (or whatever)?

    I also found this that helpfully explains how to add info to an existing tiff

    In summary, I'm looking for advice for how to construct a viable info object that can be passed to .write_tiff like this

    tifftools.write_tiff(info, ...)

    where info is constructed from 1) a numpy array containing the image, 2) an array or dictionary of exif data, and 3) an array or dictionary of other metadata (tags)

    opened by dbuscombe-usgs 2
  • Make it easier to add new ifds by not requiring path_or_fobj.

    The path_or_fobj internal value is only required for ifds if they need to transfer data from an existing tiff file. If tags are entirely self-contained, this is no longer required to be set.

    opened by manthey 0
  • Better handle NDPI files

    NDPI files aren't quite valid tiff files. Rather, they are marked as non-bigtiff, but use 64-bit values for ifd offsets and have some implied upper bits for some data offsets. If a file is larger than 4 Gb or of unknown length, and a NDPI-specific tag is encountered that could have otherwise invalid offsets, read additional IFD offsets as 64-bit values and adjust data offset values according to the NDPI methods as illustrated by the OpenSlide library.

    opened by manthey 0
  • Output values on word boundaries.

    Better handle saving to small tiff. Before, once written to a bigtiff, it was unlikely to convert back to a small tiff since some fields written as LONG8 didn't automatically convert to LONG.

    opened by manthey 0
  • More often generate small tiff

    Better handle saving to small tiff. Before, once written to a bigtiff, it was unlikely to convert back to a small tiff since some fields written as LONG8 didn't automatically convert to LONG.

    opened by manthey 0
Releases(v1.3.6)
Owner
Digital Slide Archive
Tools for the management, visualization, and analysis of digital pathology data.