Perhaps the biggest decision is whether you need to go down to the individual page level to determine the source system, or whether you can define it at the site level (or some other grouping) and then apply it to every URL within that site.
Chimera can scrape this information out of the pages. Alternatively, if you have already defined which subdomains (or other groupings) use which source systems, you can use maps (or just a formula, depending on the complexity) to derive it.
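
As a rough illustration of the site-level approach, the sketch below derives a source system from each URL's hostname using a plain lookup table. It is not Chimera's own map feature or syntax; the hostnames, the system names, and the `source_system_for` helper are all hypothetical placeholders to swap for your own definitions.

```python
from urllib.parse import urlsplit

# Hypothetical mapping of hostnames to source systems (assumed values;
# replace with the groupings you have actually defined).
SOURCE_SYSTEM_BY_HOST = {
    "www.example.com": "Drupal",
    "blog.example.com": "WordPress",
    "support.example.com": "Zendesk",
}


def source_system_for(url, default="Unknown"):
    """Return the source system for a URL based on its hostname.

    URLs whose hostname is not in the mapping (or that have no
    hostname at all) fall back to `default`.
    """
    host = urlsplit(url).hostname or ""
    return SOURCE_SYSTEM_BY_HOST.get(host, default)


if __name__ == "__main__":
    for url in (
        "https://blog.example.com/2020/some-post",
        "https://support.example.com/articles/123",
        "https://shop.example.com/cart",
    ):
        print(url, "->", source_system_for(url))
```

URLs that fall back to the default are a useful signal: they are the ones where a site-level rule is not enough and page-level detection may be needed.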