I think what you are looking for is a "fuzzy match" on the file basename only.
For what it's worth, the later Enterprise versions of Microsoft Excel (I think from 2013 and up) have a basic fuzzy search capability in the Power Query plugin, although it's not very well documented or easy to control. So, in principle, if you can get your candidate file lists exported into CSV, you could then "have a go", provided you have access to the appropriate Excel license.
In my limited experience of playing with the Excel feature (in a different context to yours) I found it was remarkably hard to get a fuzzy match set up to reliably capture names that are "obviously" related as far as a human observer is concerned, without introducing a high level of false positives.
It's a surprisingly tricky task. My suggestion for a possible approach would be something like:
- Extract your "candidate" files into a structured table including key parameters such as path (your "(Low)" directory for example); filename; encoding; file size; artist; album; cover art etc.
- Preprocess the table content to remove common confusing factors, especially in file and artist names (e.g. "Remix", "Featuring"), and/or harmonise the way common terms are represented. E.g. "Junior" and "Jr." in an artist name mean the same to a human, but look very little alike to a fuzzy search algorithm, so these things need to be harmonised before the comparison.
- Feed the result into a tool with a decent fuzzy match capability, possibly as an extension to one of the well-known database tools such as MySQL/MariaDB.
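To make the preprocess-then-compare idea a bit more concrete, here is a rough Python sketch using only the standard library (`difflib.SequenceMatcher`) rather than Excel or a database extension. The replacement table, the noise-word list, the 0.85 threshold and the function names are all my own assumptions for illustration, and would need tuning against a real collection.

```python
import re
from difflib import SequenceMatcher
from pathlib import PurePath

# Assumed harmonisation rules (illustrative only; extend for your own files).
REPLACEMENTS = {"jr": "junior", "feat": "featuring", "ft": "featuring"}
# Assumed "confusing factor" words to drop before comparing.
NOISE_WORDS = {"remix", "featuring"}


def normalise(path):
    """Reduce a path to a lower-case, harmonised basename without extension."""
    stem = PurePath(path).stem.lower()
    # Keep only alphanumeric runs, so "Jr." becomes "jr" and brackets vanish.
    tokens = re.findall(r"[a-z0-9]+", stem)
    tokens = [REPLACEMENTS.get(tok, tok) for tok in tokens]
    return " ".join(tok for tok in tokens if tok not in NOISE_WORDS)


def similarity(path_a, path_b):
    """Similarity score between 0.0 and 1.0 on the normalised basenames."""
    return SequenceMatcher(None, normalise(path_a), normalise(path_b)).ratio()


def find_matches(low_paths, high_paths, threshold=0.85):
    """For each low-quality file, report its best-scoring counterpart, if any."""
    matches = []
    for low in low_paths:
        scored = [(similarity(low, high), high) for high in high_paths]
        if not scored:
            continue
        score, best = max(scored)
        if score >= threshold:
            matches.append((low, best, score))
    return matches


if __name__ == "__main__":
    low = ["Music/(Low)/Sammy Davis Jr. - Candy Man (Remix).mp3"]
    high = ["Music/Abba - Waterloo.flac",
            "Music/Sammy Davis Junior - Candy Man.flac"]
    for low_path, high_path, score in find_matches(low, high):
        print(f"{score:.2f}  {low_path}  ->  {high_path}")
```

Raising the threshold cuts false positives at the cost of missing some genuine pairs, which is the same trade-off I ran into with the Excel feature. A real run would probably also want to compare artist and album tags, not just the basename.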
Statistics: Posted by incans — Tue Nov 05, 2024 7:58 pm