GITFORMAT-CHUNK - Online Linux Manual Page

Section : 5
Updated : 2023-03-13
Source : Git 2.40.0
Note : Git Manual

NAME

gitformat-chunk - Chunk-based file formats

SYNOPSIS

Used by gitformat-commit-graph(5) and the "MIDX" format (see the pack format documentation in gitformat-pack(5)).

DESCRIPTION

Some file formats in Git use a common concept of "chunks" to describe sections of the file. This allows structured access to a large file by scanning a small "table of contents" for the remaining data. This common format is used by the commit-graph and multi-pack-index files. See the multi-pack-index format in gitformat-pack(5) and the commit-graph format in gitformat-commit-graph(5) for how they use the chunks to describe structured data.

A chunk-based file format begins with some header information custom to that format. That header should include enough information to identify the file type, format version, and number of chunks in the file. From this information, a reader can determine the start of the chunk-based region.

The chunk-based region starts with a table of contents describing where each chunk starts and ends. This consists of (C+1) rows of 12 bytes each, where C is the number of chunks. Consider the following table:

| Chunk ID (4 bytes) | Chunk Offset (8 bytes) |
|--------------------|------------------------|
| ID[0]              | OFFSET[0]              |
| ...                | ...                    |
| ID[C-1]            | OFFSET[C-1]            |
| 0x00000000         | OFFSET[C]              |

Each row consists of a 4-byte chunk identifier (ID) and an 8-byte offset. Each integer is stored in network-byte order.

The chunk identifier ID[i] is a label for the data stored within this file from OFFSET[i] (inclusive) to OFFSET[i+1] (exclusive). Thus, the size of the i-th chunk is equal to the difference between OFFSET[i+1] and OFFSET[i]. This requires that the chunk data appears contiguously in the same order as the table of contents.

The chunk ID of the final entry in the table of contents must be four zero bytes. This confirms that the table of contents is ending and provides the offset for the end of the chunk-based data.

Note: The chunk-based format expects that the file contains at least a trailing hash after OFFSET[C].

Functions for working with chunk-based file formats are declared in chunk-format.h. Using these functions provides extra checks that assist developers when creating new file formats.
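As a rough illustration of the layout described above, the following self-contained C sketch decodes a table of contents from a byte buffer and derives a chunk size from two adjacent offsets. It does not use Git's chunk-format API: struct toc_entry, load_be32(), load_be64(), parse_toc(), and chunk_size() are hypothetical names introduced only for this example, and the validation choices (rejecting an early zero ID or decreasing offsets) are assumptions of the sketch, not a description of Git's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOC_ENTRY_SIZE 12 /* 4-byte chunk ID + 8-byte offset */

/* Hypothetical in-memory form of one table-of-contents row. */
struct toc_entry {
    uint32_t id;
    uint64_t offset;
};

/* Decode a 32-bit integer stored in network (big-endian) byte order. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode a 64-bit integer stored in network (big-endian) byte order. */
static uint64_t load_be64(const unsigned char *p)
{
    return ((uint64_t)load_be32(p) << 32) | load_be32(p + 4);
}

/*
 * Decode nr_chunks + 1 rows (the chunks plus the terminating row) from
 * toc into out[], which must have room for nr_chunks + 1 entries.
 * toc_len is the number of bytes available at toc.
 * Returns 0 on success, or -1 if the table is truncated, a zero ID
 * appears before the last row, the offsets decrease, or the last row
 * does not carry the zero ID.
 */
int parse_toc(const unsigned char *toc, size_t toc_len,
              size_t nr_chunks, struct toc_entry *out)
{
    size_t i;

    assert(toc != NULL && out != NULL);

    /* Need room for nr_chunks + 1 rows, without overflowing size_t. */
    if (nr_chunks >= toc_len / TOC_ENTRY_SIZE)
        return -1;

    for (i = 0; i <= nr_chunks; i++) {
        const unsigned char *row = toc + i * TOC_ENTRY_SIZE;

        out[i].id = load_be32(row);
        out[i].offset = load_be64(row + 4);

        /* The zero ID is reserved for the terminating row. */
        if (i < nr_chunks && out[i].id == 0)
            return -1;
        /* Chunk data is contiguous, so offsets must not decrease. */
        if (i > 0 && out[i].offset < out[i - 1].offset)
            return -1;
    }

    return out[nr_chunks].id == 0 ? 0 : -1;
}

/* Size of the i-th chunk: the gap between two adjacent offsets. */
uint64_t chunk_size(const struct toc_entry *toc, size_t i)
{
    return toc[i + 1].offset - toc[i].offset;
}
```

With two chunks, a caller would pass a 36-byte table and an array of three entries; chunk_size(entries, 0) then gives the length of the first chunk. A real reader would also need to check every offset against the mapped file size, which this sketch leaves out.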

WRITING CHUNK-BASED FILE FORMATS

To write a chunk-based file format, create a struct chunkfile by calling init_chunkfile() and pass a struct hashfile pointer. The caller is responsible for opening the hashfile and writing header information so the file format is identifiable before the chunk-based format begins.

Then, call add_chunk() for each chunk to be written. This populates the chunkfile with information about the order and size of each chunk to write. Provide a chunk_write_fn function pointer to perform the write of the chunk data upon request.

Call write_chunkfile() to write the table of contents to the hashfile followed by each of the chunks. This will verify that each chunk wrote the expected amount of data so the table of contents is correct.

Finally, call free_chunkfile() to clear the struct chunkfile data. The caller is responsible for finalizing the hashfile by writing the trailing hash and closing the file.
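The write sequence above (register chunks, emit the table of contents, emit the chunk data, check the sizes) can be sketched in self-contained C as follows. This is a simplified model that writes into a fixed memory buffer, not Git's implementation: every name here (struct out_buf, struct demo_chunkfile, demo_add_chunk(), demo_write_chunkfile(), and the two sample writers) is hypothetical, no hashfile or trailing hash is involved, and returning -1 on a size mismatch is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CHUNKS 8
#define TOC_ENTRY_SIZE 12 /* 4-byte chunk ID + 8-byte offset */

/* Hypothetical output target: a fixed buffer with a write cursor. */
struct out_buf {
    unsigned char *data;
    size_t cap;
    size_t len;
};

/* Callback that appends one chunk's bytes; returns 0 on success. */
typedef int (*demo_write_fn)(struct out_buf *out, void *ctx);

struct demo_chunk {
    uint32_t id;
    uint64_t size; /* size promised when the chunk was registered */
    demo_write_fn fn;
};

struct demo_chunkfile {
    struct demo_chunk chunks[MAX_CHUNKS];
    size_t nr;
};

/* Append n bytes to the buffer, failing if they do not fit. */
int buf_write(struct out_buf *out, const void *src, size_t n)
{
    if (n > out->cap - out->len)
        return -1;
    memcpy(out->data + out->len, src, n);
    out->len += n;
    return 0;
}

/* Append the low nbytes bytes of v in network (big-endian) order. */
static int buf_write_be(struct out_buf *out, uint64_t v, int nbytes)
{
    unsigned char tmp[8];
    int i;

    for (i = 0; i < nbytes; i++)
        tmp[i] = (unsigned char)(v >> (8 * (nbytes - 1 - i)));
    return buf_write(out, tmp, (size_t)nbytes);
}

/* Record a chunk's ID, declared size, and writer, in output order. */
int demo_add_chunk(struct demo_chunkfile *cf, uint32_t id,
                   uint64_t size, demo_write_fn fn)
{
    if (cf->nr == MAX_CHUNKS)
        return -1;
    cf->chunks[cf->nr].id = id;
    cf->chunks[cf->nr].size = size;
    cf->chunks[cf->nr].fn = fn;
    cf->nr++;
    return 0;
}

/* Write the table of contents, then each chunk, checking its size. */
int demo_write_chunkfile(struct demo_chunkfile *cf, struct out_buf *out,
                         void *ctx)
{
    size_t i;
    uint64_t offset;

    assert(cf != NULL && out != NULL);

    /* The first chunk starts right after the (nr + 1)-row table. */
    offset = out->len + (cf->nr + 1) * TOC_ENTRY_SIZE;

    for (i = 0; i < cf->nr; i++) {
        if (buf_write_be(out, cf->chunks[i].id, 4) ||
            buf_write_be(out, offset, 8))
            return -1;
        offset += cf->chunks[i].size;
    }
    /* Terminating row: zero ID plus the end-of-chunks offset. */
    if (buf_write_be(out, 0, 4) || buf_write_be(out, offset, 8))
        return -1;

    for (i = 0; i < cf->nr; i++) {
        size_t start = out->len;

        if (cf->chunks[i].fn(out, ctx))
            return -1;
        /* A mismatch would make the table of contents wrong. */
        if (out->len - start != cf->chunks[i].size)
            return -1;
    }
    return 0;
}

/* Sample chunk writers for the usage note below. */
int write_abc(struct out_buf *out, void *ctx)
{
    (void)ctx;
    return buf_write(out, "abc", 3);
}

int write_hello(struct out_buf *out, void *ctx)
{
    (void)ctx;
    return buf_write(out, "hello", 5);
}
```

A caller would first write its own header with buf_write(), then register write_abc with size 3 and write_hello with size 5, and finally call demo_write_chunkfile(). Because the offsets are computed from the declared sizes before any chunk data is written, the size check after each callback is what keeps the table of contents honest.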

READING CHUNK-BASED FILE FORMATS

To read a chunk-based file format, the file must be opened as a memory-mapped region. The chunk-format API expects that the entire file is mapped as a contiguous memory region.

Initialize a struct chunkfile pointer with init_chunkfile(NULL).

After reading the header information from the beginning of the file, including the chunk count, call read_table_of_contents() to populate the struct chunkfile with the list of chunks, their offsets, and their sizes.

Extract the data for each chunk using pair_chunk() or read_chunk():

• pair_chunk() sets a given pointer to the location inside the memory-mapped file corresponding to that chunk's offset. If the chunk does not exist, then the pointer is not modified.

• read_chunk() takes a chunk_read_fn function pointer and calls it with the appropriate initial pointer and size information. The function is not called if the chunk does not exist. Use this method to read chunks if you need to perform immediate parsing or if you need to execute logic based on the size of the chunk.

After calling these methods, call free_chunkfile() to clear the struct chunkfile data. This will not close the memory-mapped region. Callers are expected to own that data for as long as the pointers into the region are needed.
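The difference between the two lookup styles described above can be modeled with a small self-contained C sketch. It works on an already-parsed list of chunks and does not call Git's API: struct demo_entry, struct demo_toc, demo_pair_chunk(), demo_read_chunk(), record_size(), and the DEMO_CHUNK_NOT_FOUND return value are all hypothetical names and conventions assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CHUNK_NOT_FOUND (-2) /* hypothetical "no such chunk" code */

/* One parsed chunk: its ID, where it starts in the mapping, its size. */
struct demo_entry {
    uint32_t id;
    const unsigned char *start; /* points into the mapped file */
    size_t size;
};

struct demo_toc {
    const struct demo_entry *entries;
    size_t nr;
};

/* Callback invoked with a chunk's start pointer and size. */
typedef int (*demo_read_fn)(const unsigned char *start, size_t size,
                            void *data);

/* Linear search for a chunk by ID; returns NULL if it is absent. */
static const struct demo_entry *find_chunk(const struct demo_toc *toc,
                                           uint32_t id)
{
    size_t i;

    for (i = 0; i < toc->nr; i++)
        if (toc->entries[i].id == id)
            return &toc->entries[i];
    return NULL;
}

/*
 * Pairing style: point *p at the chunk's data. If the chunk does not
 * exist, *p is left unmodified and DEMO_CHUNK_NOT_FOUND is returned.
 */
int demo_pair_chunk(const struct demo_toc *toc, uint32_t id,
                    const unsigned char **p)
{
    const struct demo_entry *e;

    assert(toc != NULL && p != NULL);

    e = find_chunk(toc, id);
    if (!e)
        return DEMO_CHUNK_NOT_FOUND;
    *p = e->start;
    return 0;
}

/*
 * Callback style: hand the chunk's pointer and size to fn and return
 * its result. fn is not called if the chunk does not exist.
 */
int demo_read_chunk(const struct demo_toc *toc, uint32_t id,
                    demo_read_fn fn, void *data)
{
    const struct demo_entry *e;

    assert(toc != NULL && fn != NULL);

    e = find_chunk(toc, id);
    if (!e)
        return DEMO_CHUNK_NOT_FOUND;
    return fn(e->start, e->size, data);
}

/* Sample callback: store the chunk size in the size_t that data points to. */
int record_size(const unsigned char *start, size_t size, void *data)
{
    (void)start;
    *(size_t *)data = size;
    return 0;
}
```

The pairing style suits chunks whose layout is fixed and can be interpreted later, while the callback style gives the caller the size up front, for example to reject a chunk whose length is not a multiple of the expected record width. In both cases the pointers stay valid only while the caller keeps the mapped region open.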

EXAMPLES

These file formats use the chunk-format API, and can be used as examples for future formats:

• commit-graph: see write_commit_graph_file() and parse_commit_graph() in commit-graph.c for how the chunk-format API is used to write and parse the commit-graph file format documented in gitformat-commit-graph(5).

• multi-pack-index: see write_midx_internal() and load_multi_pack_index() in midx.c for how the chunk-format API is used to write and parse the multi-pack-index file format documented in the multi-pack-index file format section of gitformat-pack(5).

GIT

Part of the git(1) suite
