
incompatible-jpeg-blocks

  • Warning

    This repository is no longer maintained; please refer to this one.

    Finding incompatible JPEG blocks

    Repository for the paper Finding Incompatible Blocks for Reliable JPEG Steganalysis (E. Levecque, J. Butora, P. Bas)

    JPEG compression can be seen as a mathematical function that maps a pixel block to a DCT block. But if you modify a DCT block, does it still have a pixel antecedent, that is, a pixel block that compresses to it? If the answer is no, then the block is incompatible!

    Our work shows that most of the time at QF100, the bigger the modification, the more likely your block will be incompatible.

    This method can reliably detect steganographic messages embedded in an image at QF100 and outperforms the other methods, especially for very small messages.
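    To make the notion of antecedent concrete, here is a minimal sketch for a single grayscale block at QF100 (where the quantization table is all ones). It uses the mathematical DCT from scipy.fft.dctn, as in the naive pipeline; the function names compress_block and is_antecedent are illustrative and are not part of this repository.

```python
import numpy as np
from scipy.fft import dctn


def compress_block(pixels):
    """Naive JPEG compression of one 8x8 grayscale block at QF100.

    Level shift by 128, orthonormal 2D DCT, then rounding.
    At QF100 every quantization step is 1, so no division is needed.
    """
    shifted = pixels.astype(np.float64) - 128.0
    return np.rint(dctn(shifted, norm="ortho")).astype(np.int64)


def is_antecedent(pixels, dct_block):
    """Return True if `pixels` compresses exactly to `dct_block`."""
    return np.array_equal(compress_block(pixels), dct_block)


rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(8, 8))
dct_block = compress_block(pixels)

# The original pixel block is, by construction, an antecedent of its DCT block.
print(is_antecedent(pixels, dct_block))

# After modifying one coefficient, the same pixel block is no longer an antecedent.
# Another antecedent may or may not exist: deciding this is the search problem.
modified = dct_block.copy()
modified[3, 4] += 1
print(is_antecedent(pixels, modified))
```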

    Installation

    Requires Python >= 3.8.

    Clone the repository:

    git clone https://gitlab.cristal.univ-lille.fr/elevecqu/incompatible-jpeg-blocks.git

    Install the requirements listed in the repository:

    pip install -r requirements.txt

    Examples

    If you have an image and want to experiment with it:

    from antecedent.image import Image
    from antecedent.pipeline import create_pipeline
    
    path = r"path/to/my/image"
    
    img = Image('toy_image', path)
    pipeline = create_pipeline('islow', 100, img.is_grayscale, target_is_dct=img.is_jpeg)  # Islow pipeline with QF100
    
    # Example for a double compression pipeline:
    # pipeline = create_pipeline(['islow', 'naive'], [90,100], img.is_grayscale, target_is_dct=img.is_jpeg) 
    
    img.set_pipeline(pipeline)
    img.set_selection_parameter('random', 10)  # 10% of the blocks randomly
    
    img.search_antecedent(1000)
    
    for pos, block in img.block_collection.items():
        if pipeline in block.status:
            # status is:
            # 3 if block has been ignored
            # 1 if an antecedent has been found
            # 0 otherwise (potentially incompatible)
            # -1 for incompatible
            status = block.status[pipeline]
            antecedent = block.antecedents[pipeline]
            iteration = block.iterations[pipeline]
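    The status codes listed in the comments above can be tallied once the search has finished. The helper below is a hypothetical convenience function, not part of this repository; it only assumes that the statuses have been collected into a list of integers.

```python
from collections import Counter

# Meaning of each status code, as described in the example above.
STATUS_LABELS = {
    3: "ignored",
    1: "antecedent found",
    0: "unresolved (potentially incompatible)",
    -1: "incompatible",
}


def summarize_statuses(statuses):
    """Count how many blocks fall into each status category."""
    counts = Counter(statuses)
    return {label: counts.get(code, 0) for code, label in STATUS_LABELS.items()}


summary = summarize_statuses([1, 1, 0, -1, 3])
print(summary)
```

    With the image example, the input list could be built as [block.status[pipeline] for block in img.block_collection.values() if pipeline in block.status].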

    If you have a single block:

    import numpy as np
    from antecedent.block import Block
    from antecedent.pipeline import create_pipeline
    
    array = np.random.randint(-1016, 1016, size=(8, 8))  # random DCT block
    block = Block(array)
    pipeline = create_pipeline('float', 90, grayscale=True, target_is_dct=True)  # Float pipeline with QF90 for grayscale blocks
    
    block.search_antecedent(pipeline, 1000)
    antecedent = block.antecedents[pipeline]
    iteration = block.iterations[pipeline]
    if antecedent is None:
        print('No antecedent found, could be incompatible.')
    else:
        print(f"Compatible!\nFound an antecedent in {iteration} iterations: \n{antecedent}")

    Dataset usage

    If you have a dataset of images or want to customize your search, please use the config file as follows.

    experiment_name: Experiment Name
    
    seed: 123

    Every experiment with the same name will be stored in the same directory. Seed is for reproducibility.

    data:
      input_path: "path/to/my/image/folder"
      output_path: "path/to/my/output/folder"
      starting_index: 0 # start at the first image of the folder
      ending_index: -1 # end at the last image of the folder

    starting_index and ending_index can be used to select a subset of images. For example, if you have 50 images in your folder and you set starting_index=45 and ending_index=-1, only the last 5 images in ASCII filename order will be processed.
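    The selection rule can be sketched as follows. This is an illustration rather than the repository's code: select_images is a hypothetical name, and treating a non-negative ending_index as an exclusive bound is an assumption, since only the -1 case is documented here.

```python
def select_images(filenames, starting_index=0, ending_index=-1):
    """Return the subset of filenames between the two indices.

    Filenames are sorted first; for ASCII names, Python's string ordering
    matches ASCII order. ending_index == -1 means "up to the last image".
    """
    ordered = sorted(filenames)
    if ending_index == -1:
        return ordered[starting_index:]
    # Assumption: a non-negative ending_index is treated as exclusive.
    return ordered[starting_index:ending_index]


names = [f"img_{i:02d}.jpg" for i in range(50)]
subset = select_images(names, starting_index=45, ending_index=-1)
print(subset)
```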

      preprocessing:
        avoid_clipped: True
        avoid_uniform: True
        percentage_block_per_image: 100
        sorting_method: variance

    These parameters filter out clipped or uniform blocks (recommended) and select a subset of the blocks in the image. Most of the time, 10% of the blocks are sufficient to classify the image as modified or not. See the Blocks selection and filtering section for more details.
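    As a rough illustration of these preprocessing options, the sketch below filters 8x8 pixel blocks and then draws a random percentage of the remaining ones. The definitions used here are assumptions: a block is called clipped if it contains the value 0 or 255, uniform if all its pixels are equal, and the percentage is applied after filtering. The function names are hypothetical and the repository may implement these steps differently.

```python
import numpy as np


def is_clipped(block):
    """Assumed definition: the block contains a saturated pixel (0 or 255)."""
    return bool((block == 0).any() or (block == 255).any())


def is_uniform(block):
    """Assumed definition: all pixels in the block share the same value."""
    return bool((block == block.flat[0]).all())


def select_blocks(blocks, percentage, avoid_clipped=True, avoid_uniform=True, seed=123):
    """Return sorted indices of the blocks kept after filtering and random sampling."""
    kept = [
        index
        for index, block in enumerate(blocks)
        if not (avoid_clipped and is_clipped(block))
        and not (avoid_uniform and is_uniform(block))
    ]
    n_selected = int(round(len(kept) * percentage / 100))
    if n_selected == 0:
        return []
    rng = np.random.default_rng(seed)
    chosen = rng.choice(kept, size=n_selected, replace=False)
    return sorted(int(index) for index in chosen)


clipped = np.full((8, 8), 100)
clipped[0, 0] = 255
uniform = np.full((8, 8), 100)
regular = [np.arange(64).reshape(8, 8) + 1 + k for k in range(10)]
blocks = [clipped, uniform] + regular

print(select_blocks(blocks, percentage=100))
print(select_blocks(blocks, percentage=50))
```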

    antecedent_search:
      max_workers: null # int, null to use all available CPUs
      pipeline: naive
      quant_tbl: 100

    Definition of the expected pipeline for the image. See the Pipeline section for more details.

      heuristic_solver:
        use: True
        max_iteration_per_block: 1000

    The heuristic solver is a local search to find antecedents. It is quite fast but cannot prove that a block is incompatible.
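    To illustrate why a local search can find antecedents but cannot prove incompatibility, here is a toy local search for the naive pipeline at QF100. It is not the heuristic implemented in this repository: the starting point, the single-pixel moves and the cost function are all assumptions chosen for brevity.

```python
import numpy as np
from scipy.fft import dctn, idctn


def compress(pixels):
    """Naive QF100 compression: level shift, orthonormal 2D DCT, rounding."""
    return np.rint(dctn(pixels.astype(np.float64) - 128.0, norm="ortho")).astype(np.int64)


def toy_local_search(target_dct, max_iteration=1000, seed=0):
    """Look for a pixel block whose compression equals target_dct.

    Returns (antecedent, iterations). The antecedent is None when the budget
    is exhausted, which does NOT prove that the block is incompatible.
    """
    rng = np.random.default_rng(seed)
    # Start from the rounded and clipped decompression of the target.
    start = idctn(target_dct.astype(np.float64), norm="ortho") + 128.0
    current = np.clip(np.rint(start), 0, 255).astype(np.int64)
    cost = np.abs(compress(current) - target_dct).sum()

    for iteration in range(max_iteration):
        if cost == 0:
            return current, iteration
        # Move: change one random pixel by +1 or -1, staying inside [0, 255].
        i, j = rng.integers(0, 8, size=2)
        candidate = current.copy()
        candidate[i, j] = np.clip(candidate[i, j] + rng.choice([-1, 1]), 0, 255)
        candidate_cost = np.abs(compress(candidate) - target_dct).sum()
        # Accept moves that do not increase the number of mismatches.
        if candidate_cost <= cost:
            current, cost = candidate, candidate_cost

    if cost == 0:
        return current, max_iteration
    return None, max_iteration


target = np.zeros((8, 8), dtype=np.int64)  # DCT block of a uniform gray block
antecedent, iterations = toy_local_search(target)
print(antecedent is not None, iterations)
```

    When the search returns None, the only conclusion is that no antecedent was found within the iteration budget, which is why an exact solver is needed to certify incompatibility.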

      gurobi_solver: # !! only possible if pipeline == naive !!
        use: False
        max_iteration_per_block: 1000
        mip_focus: 1 # 1, 2 or 3
        threads: 3 # !! total number of jobs will be threads * worker !!
        cutoff: 0.500001
        node_file_start: 0.5 # null for no RAM usage limit

    Gurobi is a licensed ILP solver, but free licenses are available through public institutions. This solver can only be used for JPEG images at QF100 with a naive pipeline.
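    For intuition, one way to state the antecedent search for the naive pipeline at QF100 as an integer feasibility problem is given below. This is a sketch under assumptions and not necessarily the exact model passed to Gurobi: D denotes the 8x8 orthonormal DCT-II matrix, c the observed DCT block and x the unknown pixel block; how ties at exactly 1/2 are handled depends on the rounding convention.

```latex
\text{find } x \in \{0, 1, \dots, 255\}^{8 \times 8}
\quad \text{such that} \quad
\left\lVert D\,(x - 128)\,D^{\top} - c \right\rVert_{\infty} \le \tfrac{1}{2}
```

    If no such x exists, the block c is incompatible. The constraints are linear in x, which is why an ILP solver can be used in this setting.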

    With your custom config file, run the following command:

    python3 -m antecedent.run_on_dataset <your_config.yaml> --verbose

    Pipeline

    There are currently 4 pipelines implemented:

    Class name      Parameter name   Description
    NaivePipeline   naive            Mathematical DCT using scipy.fft.dctn. Most precise but rarely used for images.
    IslowPipeline   islow            Libjpeg islow. Uses a 32-bit integer DCT algorithm.
    FloatPipeline   float            Libjpeg float. Uses a floating-point DCT algorithm.
    IfastPipeline   ifast            Libjpeg ifast. Uses a 16-bit integer DCT algorithm.

    You can add a custom pipeline to use a different DCT algorithm or different rounding functions. To do so, add a new class to pipeline.py with the following methods:

    from antecedent.pipeline import JPEGPipeline
    
    
    class YourCustomPipeline(JPEGPipeline):
        def __init__(self, quality, grayscale, target_is_dct):
            super().__init__('your_custom_name', quality, grayscale, target_is_dct)
    
        @classmethod
        def is_named(cls, name):
            return name == 'your_custom_name'
    
        # Necessary to run the code
        def dct(self, blocks):
            pass
    
        def idct(self, blocks):
            pass
    
        # Optional: override these to customize your pipeline
        def round_dct(self, blocks):
            pass
    
        def round_pixel(self, blocks):
            pass
    
        def rgb_to_ycc(self, blocks):
            pass
    
        def ycc_to_rgb(self, blocks):
            pass
    
        def define_quant_table(self):
            # Must define self.quant_tbl
            pass

    Some useful functions can be found in jpeg_toolbox.py.
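    As a starting point for the dct and idct methods of a custom pipeline, the sketch below shows a batched mathematical DCT written as standalone functions. It assumes that blocks is an array of shape (n, 8, 8) and that level shifting and quantization are handled elsewhere; both points are assumptions about the expected interface, so check them against the existing pipelines in pipeline.py.

```python
import numpy as np
from scipy.fft import dctn, idctn


def dct(blocks):
    """Orthonormal 2D DCT applied to the last two axes of a batch of blocks."""
    return dctn(np.asarray(blocks, dtype=np.float64), axes=(-2, -1), norm="ortho")


def idct(blocks):
    """Inverse of dct, again applied blockwise on the last two axes."""
    return idctn(np.asarray(blocks, dtype=np.float64), axes=(-2, -1), norm="ortho")


rng = np.random.default_rng(0)
batch = rng.integers(-128, 128, size=(4, 8, 8))
roundtrip = idct(dct(batch))
print(np.allclose(roundtrip, batch))
```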

    Double compression pipeline

    Suppose you observe some JPEG images C and want to find antecedents A through a double compression pipeline:

    A (DCT) ---- pipe 1 ----> B (Pixel) ---- pipe 2 ----> C (DCT)

    with:

    • pipe 1: Naive pipeline with QF75
    • pipe 2: Islow pipeline with QF98

    You can define the pipelines in the config file as follows:

    antecedent_search:
      max_workers: null # int, null to use all available CPUs
      pipeline: [naive, islow]
      quant_tbl: [75, 98]

    Or directly in python as follows:

    from antecedent.pipeline import create_pipeline
    
    grayscale = True  # depends on your image/block
    
    pipeline = create_pipeline(['naive', 'islow'], [75, 98], grayscale=grayscale, target_is_dct=True)
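    The A -> B -> C chain above can also be written out by hand to see what an antecedent must satisfy. The sketch below is a simplification and does not use this repository's classes: both stages use the mathematical DCT from scipy (the islow stage is replaced by a naive one), the block is grayscale, and the quantization tables are derived from the standard luminance table with the usual libjpeg quality scaling.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Standard JPEG luminance quantization table (quality 50).
BASE_LUMINANCE_TABLE = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])


def quant_table(quality):
    """Scale the base table for a quality factor in [1, 100] (libjpeg convention)."""
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return np.clip((BASE_LUMINANCE_TABLE * scale + 50) // 100, 1, 255)


def decompress(dct_block, quality):
    """DCT block -> pixel block: dequantize, inverse DCT, round and clip."""
    coefficients = (dct_block * quant_table(quality)).astype(np.float64)
    pixels = idctn(coefficients, norm="ortho") + 128.0
    return np.clip(np.rint(pixels), 0, 255).astype(np.int64)


def compress(pixels, quality):
    """Pixel block -> DCT block: level shift, DCT, quantize and round."""
    coefficients = dctn(pixels.astype(np.float64) - 128.0, norm="ortho")
    return np.rint(coefficients / quant_table(quality)).astype(np.int64)


def double_compression(block_a, quality_1=75, quality_2=98):
    """A (DCT) --pipe 1--> B (pixel) --pipe 2--> C (DCT)."""
    block_b = decompress(block_a, quality_1)
    return compress(block_b, quality_2)


block_a = np.zeros((8, 8), dtype=np.int64)
block_a[0, 0] = 5  # a single non-zero DC coefficient
block_c = double_compression(block_a)
print(block_c)
```

    In this setting, A is an antecedent of an observed block C exactly when double_compression(A) equals C.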

    Blocks selection and filtering

    If you have a JPEG image at QF100 and want to know whether it has been modified, our paper shows that the blocks most likely to be incompatible can be selected using the variance of the rounding error. This selection reduces the number of time-consuming antecedent searches while keeping very good (sometimes better) results.

    Parameter name   Description
    random           Random selection
    variance         Select blocks with the highest spatial rounding error variance (not implemented yet)
    pmap             Select blocks with the highest probability of modification (not implemented yet)
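    Since the variance criterion is listed as not implemented yet, the sketch below only illustrates the idea for grayscale DCT blocks at QF100: decompress each block with the mathematical inverse DCT, measure the variance of the rounding error in the pixel domain, and keep the blocks with the highest values. The function names and the array layout (n, 8, 8) are assumptions, and the paper's exact criterion may differ.

```python
import numpy as np
from scipy.fft import idctn


def spatial_rounding_error_variance(dct_blocks):
    """Variance of the pixel-domain rounding error of each block (QF100, naive IDCT)."""
    pixels = idctn(dct_blocks.astype(np.float64), axes=(1, 2), norm="ortho") + 128.0
    error = pixels - np.rint(pixels)
    return error.var(axis=(1, 2))


def select_top_variance(dct_blocks, percentage):
    """Return sorted indices of the `percentage` % of blocks with the highest variance."""
    variances = spatial_rounding_error_variance(dct_blocks)
    n_selected = max(1, int(round(len(dct_blocks) * percentage / 100)))
    order = np.argsort(variances)[::-1]  # highest variance first
    return np.sort(order[:n_selected])


dct_blocks = np.zeros((10, 8, 8), dtype=np.int64)
dct_blocks[3, 0, 1] = 1  # one block with a non-zero AC coefficient
selected = select_top_variance(dct_blocks, percentage=10)
print(selected)
```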

    Citation

    If you use our work, please use this citation:

    @article{levecque2024finding,
      title={Finding Incompatibles Blocks for Reliable JPEG Steganalysis},
      author={Levecque, Etienne and Butora, Jan and Bas, Patrick},
      journal={arXiv preprint arXiv:2402.13660},
      year={2024}
    }

    License

    This work is released under the MIT License.