We investigate the disagreement between two diff algorithms, Myers and Histogram, and manually measure the quality of the diff lists they generate.

Based on previous related studies, we investigate the code changes in files from 14 OSS projects that employ Continuous Integration (for metrics collection) and 10 Apache projects (for bug-introduction identification) to quantify the differences between the diff outputs of the two algorithms. We analyze the quality of the patches derived from Myers and Histogram by manually comparing their diffs for 377 changes, a statistically representative sample of the 21,590 changes identified in the above two comparisons.

Our findings show that using different diff algorithms in the git diff command produces unequal diff lists. This affects the number of files that have dissimilar added and deleted lines of code in each CI-Java project: the differences in added and deleted lines, distinguished by their number and position, range from 0.8% to 6.2% and from 1.4% to 7.6%, respectively. The divergent diff outputs also affect the number of files identified in bug-introduction identification: the percentage of files that have different deleted lines of code ranges from 2.4% to 6.6%. Regarding the patch analysis, we found that the two algorithms differ in how well they present code changes; however, both are equally good at generating the list of non-code changes.

The Histogram strategy works similarly to Patience: it builds a histogram of the number of appearances of every line in the first version of a file. Every element in the second version is then matched, in order, against the first sequence to determine whether the element exists there and to count its occurrences. If an element exists and its occurrences are fewer than in the first sequence, it is treated as a candidate for the LCS (longest common subsequence).
Once the second sequence has been scanned, the LCS candidate with the lowest occurrence count is marked as the separator. The two sections resulting from the partition (section 1 is the region before the LCS, section 2 the region after it) are then processed recursively, using the same procedure as at the start of the algorithm. This means that Histogram behaves like Patience if a unique common element exists in both files; otherwise, it selects the element with the fewest occurrences. Compared with the other two diff algorithms (Myers and Patience), Histogram has nevertheless been reported to be much faster.

In contrast with Myers, the Histogram algorithm provides diff results that are easier for software archive miners to understand, as Histogram separates the changed code lines more clearly. This diff algorithm uses a unique line of code as a reference point to match the sequences of changed lines between the two files.
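The Histogram procedure described above can be illustrated with a minimal Python sketch. It is a simplified illustration, not the implementation used by Git: the function name `histogram_diff` and the `(tag, line)` output format are assumptions made for this example, and production implementations include safeguards and tie-breaking rules that are not shown here.

```python
from collections import Counter


def histogram_diff(a, b):
    """Simplified histogram-style diff of two lists of lines.

    Returns a list of (tag, line) pairs, where tag is ' ' for a common
    line, '-' for a line only in `a`, and '+' for a line only in `b`.
    """
    # If either side is empty, everything left is a pure delete/insert.
    if not a or not b:
        return [("-", x) for x in a] + [("+", x) for x in b]

    # Histogram of how often each line appears in the first version.
    counts_a = Counter(a)

    # Scan the second version in order and keep the common line with
    # the lowest occurrence count in the first version.
    best = None  # (occurrences, index in a, index in b)
    for j, line in enumerate(b):
        n = counts_a.get(line, 0)
        if n == 0:
            continue  # line does not exist in the first version
        if best is None or n < best[0]:
            best = (n, a.index(line), j)

    # No common line at all: the whole region changed.
    if best is None:
        return [("-", x) for x in a] + [("+", x) for x in b]

    _, i, j = best

    # Grow the anchor into the surrounding run of identical lines.
    start_a, start_b = i, j
    while start_a > 0 and start_b > 0 and a[start_a - 1] == b[start_b - 1]:
        start_a -= 1
        start_b -= 1
    end_a, end_b = i + 1, j + 1
    while end_a < len(a) and end_b < len(b) and a[end_a] == b[end_b]:
        end_a += 1
        end_b += 1

    # The matched run is the separator; recurse on the section before
    # it and the section after it.
    before = histogram_diff(a[:start_a], b[:start_b])
    common = [(" ", x) for x in a[start_a:end_a]]
    after = histogram_diff(a[end_a:], b[end_b:])
    return before + common + after


if __name__ == "__main__":
    old = ["a", "b", "c", "d"]
    new = ["a", "x", "c", "d"]
    for tag, line in histogram_diff(old, new):
        print(tag, line)
```

When a line occurs exactly once in the first version, it is chosen as the anchor, which corresponds to the Patience-like behaviour mentioned above; when no such line exists, the sketch falls back to the common line with the fewest occurrences.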