In a previous post on March 23, 2012, we talked a little about our efforts to create a more usable report for unmatched headings. We have since added more functionality to the report that we hope helps clarify the results. We also plan to continue refining the algorithm we use for the near matches, as well as the confidence level we assign to each near match.

Here are a few examples from the report (from its current build):

ocm05472887 100 1_ $a Allen, Junius Mordecai.
    99.5% no 95045186 400 1_ $a Allen, Junius Mordecai, $d 1875-1906.
    56.5% no 00103969 100 1_ $a Allen, Junius, $d 1898-1962

ocm77567496 650 _0 $a Adventure and adventurer $v Fiction.
    97.1% sh 85001072 450 __ $a Adventure and adventurers $v Fiction
    70.6% sh2009113774 150 __ $a Adventure and adventurers $z Europe $v Biography

ocm02224738, ocm02464058, ocm02735261, ocm03462153, ocm04493529 490 0_ $a Old West
    99.5% no 96034673 130 _0 $a Old West (Alexandria, Va.)
    99.5% n 99000801 151 __ $a Old West Lawrence Historic District (Lawrence, Kan.)

Not all near matches will rank so high on our "confidence level percentage", but these three should give you a better idea of the report's results.

We match as much of the original heading to the near match as possible. Whatever matches the unmatched heading is highlighted in BLUE. Parts of the near match that are potential typos, or new additions not contained in the unmatched heading, are set off in RED. The second near match is highlighted similarly to the first, but in GREEN, to help distinguish between the two near matches.

As a next step, we are looking into the possibility of sorting this report by percentile, so that near matches in the 90th percentile are listed first (and sorted A-Z within that group). This might take some extra finagling from our programming team to implement successfully, but we will keep you updated on our progress.
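To make the proposed ordering concrete, here is a minimal Python sketch of one way a "percentile first, then A-Z" sort could work. It is an illustration only, not our production code: the names UnmatchedHeading, band, and sort_report are hypothetical, and it assumes that each unmatched heading is ranked by the best confidence among its near matches and that a "percentile" means a 10-point band (90-100, 80-89, and so on). The headings and percentages are the three examples shown above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UnmatchedHeading:
    # Hypothetical record type for this sketch; not the report's real format.
    heading: str                   # text of the unmatched heading
    confidences: List[float]       # confidence % of each near match (0-100)


def band(confidence: float) -> int:
    """Map a confidence percentage to its 10-point band (assumed grouping).

    99.5 -> 90, 70.6 -> 70, 4.2 -> 0. A score of exactly 100 stays in the 90 band.
    """
    return min(int(confidence // 10) * 10, 90)


def sort_report(entries: List[UnmatchedHeading]) -> List[UnmatchedHeading]:
    """Highest band first; within a band, alphabetical ignoring case."""
    return sorted(
        entries,
        key=lambda e: (-band(max(e.confidences)), e.heading.casefold()),
    )


# The three example headings from this post, with their two near-match scores.
REPORT = [
    UnmatchedHeading("Old West", [99.5, 99.5]),
    UnmatchedHeading("Allen, Junius Mordecai.", [99.5, 56.5]),
    UnmatchedHeading("Adventure and adventurer $v Fiction.", [97.1, 70.6]),
]

for entry in sort_report(REPORT):
    print(band(max(entry.confidences)), entry.heading)
```

All three examples fall in the 90 band, so here the sketch only shows the A-Z ordering within a group: Adventure, then Allen, then Old West. A heading whose best near match scored lower would drop into a later band and be listed after all of these.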
While the higher percentile near matches are useful for showing you what may actually be a valid match, the lower percentile matches are also useful for identifying (or dismissing) headings for which no near match exists. Every unmatched heading will have two near matches listed underneath it, even if those near matches are very low probability (less than 5%). This is due to how our algorithm is set up to generate near matches for the report.

This report is called: R00 - Near Match Report.htm

Please feel free to contact your project managers to request that we start delivering this report with your Current Cataloging results (at no extra cost):

Judy Archer (email <mailto:jarcher@bslw.com?subject=R00%20-%20Near%20Match%20Report>)
Stephanie Hansen (email <mailto:shansen@bslw.com?subject=R00%20-%20Near%20Match%20Report>)

We will still be delivering R07 (Unmatched Headings) and R10 (Multiple Authority Matches), so the R00 - Near Match Report won't yet replace those. But since every unmatched heading will have two near matches listed underneath it, we do want to point out that the report can be quite large, depending on the size of your Current Cataloging file (and its matching results).

We welcome your feedback!

Nate Cothran - nate@bslw.com <mailto:nate@bslw.com?subject=Automation%20Services%20-%20Query>
Product Manager, Automation
Backstage Library Works
533 E 1860 S, Provo UT 84606
(p) 801.342.5697 - (f) 801.356.8220
www.ac.bslw.com/community/blog <http://ac.bslw.com/community/blog/>