Collation: Prabhed and Its Predecessors
This chapter first describes two earlier collation programs created by the Jadavpur School, Tafat 1.0 and Pathantar, explaining the logic behind each along with its advantages and disadvantages. Two major challenges were handling translocations of material and distinguishing between prose and verse. The display design for Pathantar, retained in Bichitra, is also described. There follows a full account of Prabhed, the collation program used for Bichitra. Uniquely, this program compares texts at three levels: section (chapter, scene, canto), segment (paragraph, speech, stanza) and word. It thereby combines two functions: ‘gross collation’ of large text blocks with ‘fine collation’ of individual words and phrases using a subprogram, Tafat 2.0. The account begins with the basic problem of determining what counts as a match and working out match percentages. To this end, the gross collation program incorporates many counter-checks and adjustments. Prabhed compares every character with every other, avoiding heuristic methods. This allows it to record translocations in a new and accurate way, spanning the entire work. Like all collation software, it compares only two texts at a time, but it can combine many such comparisons to provide what is effectively an n:n collation. It has separate ‘standard’ and ‘linefeed’ parsers for prose and verse respectively. It also has a 64-bit version that enables collation of very large texts. Instead of performing dynamic on-site collation, Bichitra uploads a pre-processed set of results obtained by Prabhed.
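The two-tier idea summarised above (gross collation of segments with match percentages, then fine collation of words) can be illustrated with a small sketch. This is not Prabhed's algorithm: it substitutes Python's standard `difflib.SequenceMatcher` for Prabhed's exhaustive character comparison, and every name in it (`match_percentage`, `gross_collate`, `fine_collate`, `collate_all`, the 60 per cent threshold, the crude order test for translocations) is a hypothetical choice made for illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def match_percentage(a: str, b: str) -> float:
    """Similarity of two segments as a percentage.

    Uses difflib's ratio, an assumed stand-in for Prabhed's own measure.
    """
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


def gross_collate(base: list[str], other: list[str], threshold: float = 60.0):
    """Pair each base segment with its best match anywhere in `other`.

    Returns tuples (base_index, other_index_or_None, percentage, moved).
    `moved` is a crude translocation flag: the match lies before a
    segment that was already matched earlier in reading order.
    """
    results = []
    max_seen = -1  # highest matched index in `other` so far
    for i, seg in enumerate(base):
        best_j, best_pct = None, 0.0
        for j, cand in enumerate(other):
            pct = match_percentage(seg, cand)
            if pct > best_pct:
                best_j, best_pct = j, pct
        if best_j is None or best_pct < threshold:
            results.append((i, None, best_pct, False))  # no acceptable match
            continue
        moved = best_j < max_seen
        max_seen = max(max_seen, best_j)
        results.append((i, best_j, best_pct, moved))
    return results


def fine_collate(a: str, b: str):
    """Word-level differences between two matched segments."""
    wa, wb = a.split(), b.split()
    sm = SequenceMatcher(None, wa, wb, autojunk=False)
    return [
        (tag, wa[i1:i2], wb[j1:j2])
        for tag, i1, i2, j1, j2 in sm.get_opcodes()
        if tag != "equal"
    ]


def collate_all(witnesses: dict[str, list[str]]):
    """Combine pairwise comparisons over every pair of witnesses."""
    return {
        (x, y): gross_collate(witnesses[x], witnesses[y])
        for x, y in combinations(witnesses, 2)
    }


if __name__ == "__main__":
    v1 = ["alpha beta gamma delta", "one two three four", "red green blue yellow"]
    v2 = ["red green blue yellow", "alpha beta gamma delta", "one two three four"]
    for row in gross_collate(v1, v2):
        print(row)
    print(fine_collate("the quick brown fox", "the slow brown fox"))
```

Because each base segment is compared against every segment of the other text, a block that has moved to a distant position is still found, which is the property the abstract attributes to whole-work comparison. The pairwise results from `collate_all` cover each unordered pair once; a display that reads from any witness to any other would also need the reverse direction.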