Constructing LinkbaseDocument fails when DocumentPath is an absolute URI #28

Open
jfoshee opened this issue Oct 5, 2019 · 1 comment

jfoshee commented Oct 5, 2019

Repro Steps

using JeffFerguson.Gepsio;

// Load the Campbell Soup 2019 10-K
new XbrlDocument()
  .Load("https://www.sec.gov/Archives/edgar/data/16732/000001673219000070/cpb-20190728.xml");

Symptoms

System.Net.WebException : The remote server returned an error: (404) Not Found.

StackTrace
at System.Net.HttpWebRequest.GetResponse()
at System.Xml.XmlDownloadManager.GetNonFileStream(Uri uri, ICredentials credentials, IWebProxy proxy, RequestCachePolicy cachePolicy)
at System.Xml.XmlDownloadManager.GetStream(Uri uri, ICredentials credentials, IWebProxy proxy, RequestCachePolicy cachePolicy)
at System.Xml.XmlUrlResolver.GetEntity(Uri absoluteUri, String role, Type ofObjectToReturn)
at System.Xml.XmlTextReaderImpl.OpenUrl()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
at System.Xml.XmlDocument.Load(XmlReader reader)
at System.Xml.XmlDocument.Load(String filename)
at JeffFerguson.Gepsio.Xml.Implementation.SystemXml.Document.Load(String path) in D:\gepsio\JeffFerguson.Gepsio\Xml\Implementations\SystemXml\Document.cs:line 29
at JeffFerguson.Gepsio.LinkbaseDocument..ctor(String ContainingDocumentUri, String DocumentPath) in D:\gepsio\JeffFerguson.Gepsio\LinkbaseDocument.cs:line 25
at JeffFerguson.Gepsio.DefinitionLinkbaseDocument..ctor(String ContainingDocumentUri, String DocumentPath) in D:\gepsio\JeffFerguson.Gepsio\DefinitionLinkbaseDocument.cs:line 17
at JeffFerguson.Gepsio.LinkbaseDocumentCollection.ReadLinkbaseReference(String ContainingDocumentUri, INode LinkbaseReferenceNode) in D:\gepsio\JeffFerguson.Gepsio\LinkbaseDocumentCollection.cs:line 146
at JeffFerguson.Gepsio.LinkbaseDocumentCollection.ReadLinkbaseReferences(String ContainingDocumentUri, INode parentNode) in D:\gepsio\JeffFerguson.Gepsio\LinkbaseDocumentCollection.cs:line 133
at JeffFerguson.Gepsio.XbrlSchema.ReadAppInfo(INode AppInfoNode) in D:\gepsio\JeffFerguson.Gepsio\XbrlSchema.cs:line 456
at JeffFerguson.Gepsio.XbrlSchema.ReadAnnotations(INode AnnotationNode) in D:\gepsio\JeffFerguson.Gepsio\XbrlSchema.cs:line 448
at JeffFerguson.Gepsio.XbrlSchema.LookForAnnotations() in D:\gepsio\JeffFerguson.Gepsio\XbrlSchema.cs:line 437
at JeffFerguson.Gepsio.XbrlSchema..ctor(XbrlFragment ContainingXbrlFragment, String SchemaFilename, String BaseDirectory) in D:\gepsio\JeffFerguson.Gepsio\XbrlSchema.cs:line 217
at JeffFerguson.Gepsio.XbrlSchemaCollection.GetSchemaFromTargetNamespace(String targetNamespace, XbrlFragment parentFragment) in D:\gepsio\JeffFerguson.Gepsio\XbrlSchemaCollection.cs:line 311
at JeffFerguson.Gepsio.Item.GetSchemaElementFromSchema() in D:\gepsio\JeffFerguson.Gepsio\Item.cs:line 235
at JeffFerguson.Gepsio.Item..ctor(XbrlFragment ParentFragment, INode ItemNode) in D:\gepsio\JeffFerguson.Gepsio\Item.cs:line 133
at JeffFerguson.Gepsio.Fact.Create(XbrlFragment ParentFragment, INode FactNode) in D:\gepsio\JeffFerguson.Gepsio\Fact.cs:line 52
at JeffFerguson.Gepsio.XbrlFragment.ReadFacts() in D:\gepsio\JeffFerguson.Gepsio\XbrlFragment.cs:line 515
at JeffFerguson.Gepsio.XbrlFragment..ctor(XbrlDocument ParentDocument, INamespaceManager namespaceManager, INode XbrlRootNode) in D:\gepsio\JeffFerguson.Gepsio\XbrlFragment.cs:line 169
at JeffFerguson.Gepsio.XbrlDocument.Parse(IDocument doc) in D:\gepsio\JeffFerguson.Gepsio\XbrlDocument.cs:line 275
at JeffFerguson.Gepsio.XbrlDocument.Load(String Filename) in D:\gepsio\JeffFerguson.Gepsio\XbrlDocument.cs:line 179

Analysis

The problem arises because an invalid Linkbase Path is constructed by concatenating two absolute URIs. This leads to the 404.

Things start to go wrong in this call, which builds a collection of linkbase references. Note that the xlink:href in the second linkbaseRef is absolute.

LinkbaseDocumentCollection.ReadLinkbaseReferences(
    string ContainingDocumentUri = "http://xbrl.fasb.org/us-gaap/2018/elts/us-gaap-2018-01-31.xsd", 
    INode parentNode = { ChildNodes = [
        <link:linkbaseRef xlink:arcrole="http://www.w3.org/1999/xlink/properties/linkbase" 
                        xlink:role="http://www.xbrl.org/2003/role/definitionLinkbaseRef" 
                        xlink:type="simple" 
                        xlink:href="../elts/us-gaap-eedm-def-2018-01-31.xml" 
                        xmlns:xlink="http://www.w3.org/1999/xlink" 
                        xmlns:link="http://www.xbrl.org/2003/linkbase" />,
        <link:linkbaseRef xlink:arcrole="http://www.w3.org/1999/xlink/properties/linkbase" 
                        xlink:role="http://www.xbrl.org/2003/role/definitionLinkbaseRef" 
                        xlink:type="simple" 
                        xlink:href="http://xbrl.fasb.org/srt/2018/elts/srt-eedm1-def-2018-01-31.xml" 
                        xmlns:xlink="http://www.w3.org/1999/xlink" 
                        xmlns:link="http://www.xbrl.org/2003/linkbase" />
    ]  } )

This calls into:

private void ReadLinkbaseReference(
    string ContainingDocumentUri = "http://xbrl.fasb.org/us-gaap/2018/elts/us-gaap-2018-01-31.xsd", 
    INode LinkbaseReferenceNode = ...)

        xlinkNode.Href = "http://xbrl.fasb.org/srt/2018/elts/srt-eedm1-def-2018-01-31.xml"


private string GetFullLinkbasePath(string ContainingDocumentUri, string LinkbaseDocFilename)
    LinkbaseDocFilename = "http://xbrl.fasb.org/srt/2018/elts/srt-eedm1-def-2018-01-31.xml"
    DocumentPath = "http://xbrl.fasb.org/us-gaap/2018/elts/"
    // Constructs an invalid path:
            FullPath = DocumentPath + LinkbaseDocFilename;

Then the exception is thrown in the LinkbaseDocument constructor:

internal LinkbaseDocument(string ContainingDocumentUri, string DocumentPath)
    thisLinkbasePath = "http://xbrl.fasb.org/us-gaap/2018/elts/http://xbrl.fasb.org/srt/2018/elts/srt-eedm1-def-2018-01-31.xml"
    // Fails:
    thisXmlDocument.Load(thisLinkbasePath);

I am new to XBRL and Gepsio, so I'm uncertain where best to make the fix. However, it does seem clear that two absolute URIs should never be concatenated. I suspect that GetFullLinkbasePath should check whether LinkbaseDocFilename is an absolute URI and, if so, simply return it.
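A minimal sketch of that guard follows. It is not Gepsio's actual code: the class name and the body of the method are assumptions written only to illustrate the idea, and the fallback branch simply imitates the directory-plus-filename concatenation described above.

```csharp
using System;

static class LinkbasePathSketch
{
    // Hypothetical stand-in for GetFullLinkbasePath; not the library's implementation.
    public static string GetFullLinkbasePath(string containingDocumentUri, string linkbaseDocFilename)
    {
        // An absolute http/https reference already names the document, so return it untouched.
        if (Uri.TryCreate(linkbaseDocFilename, UriKind.Absolute, out Uri absolute)
            && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
        {
            return linkbaseDocFilename;
        }

        // Otherwise keep everything up to and including the last '/' of the
        // containing document and append the relative reference to it.
        int lastSlash = containingDocumentUri.LastIndexOf('/');
        string documentPath = lastSlash >= 0
            ? containingDocumentUri.Substring(0, lastSlash + 1)
            : string.Empty;
        return documentPath + linkbaseDocFilename;
    }
}
```

With the values from the trace above, the absolute srt-eedm1-def href would come back unchanged, while the relative ../elts/us-gaap-eedm-def-2018-01-31.xml href would still be appended to http://xbrl.fasb.org/us-gaap/2018/elts/.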

I will continue to investigate and submit a pull request if I can get it to work. Any direction is welcome.

jfoshee commented Oct 5, 2019

Now I see this issue was resolved in the develop branch with commit e9bf15a.

Is anything preventing this from being merged to master and released? That would close this issue and #22.

I do have a coding question: System.Uri seems conspicuously absent from GetFullLinkbasePath(). Is there a reason to hand-parse URIs rather than use System.Uri? For example, a check for a hard-coded http:// prefix will break again as more schemas are moved to https.

An alternative to

if (LinkbaseDocFilename.StartsWith("http://") == true)

may be

if (Uri.IsWellFormedUriString(LinkbaseDocFilename, UriKind.Absolute))
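To make the difference concrete, here is a small sketch that wraps both checks so they can be compared on the same input. The class and method names are hypothetical; only String.StartsWith and Uri.IsWellFormedUriString are real framework calls.

```csharp
using System;

static class AbsoluteCheckSketch
{
    // The prefix test quoted above, with an explicit ordinal comparison.
    public static bool PrefixCheck(string href) =>
        href.StartsWith("http://", StringComparison.Ordinal);

    // The proposed alternative: accept any well-formed absolute URI.
    public static bool UriCheck(string href) =>
        Uri.IsWellFormedUriString(href, UriKind.Absolute);
}
```

For the http://xbrl.fasb.org/srt/2018/elts/srt-eedm1-def-2018-01-31.xml href both checks return true, and for the relative ../elts/us-gaap-eedm-def-2018-01-31.xml href both return false. They disagree on an https address such as the SEC URL from the repro steps: PrefixCheck returns false, UriCheck returns true.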

Similarly, it seems strange to use System.IO.Path.DirectorySeparatorChar, as that will cause the code to behave differently on *nix versus Windows for the same XML.
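One way to sidestep the separator question, sketched here under the assumption that the containing document is always identified by an absolute URI (local file paths are not considered), is to let System.Uri resolve the href. The helper name is hypothetical; the Uri(Uri, string) constructor is the framework's own and always works with forward slashes, regardless of platform.

```csharp
using System;

static class UriResolutionSketch
{
    // Hypothetical helper; assumes containingDocumentUri is an absolute URI.
    public static string Resolve(string containingDocumentUri, string href)
    {
        var baseUri = new Uri(containingDocumentUri, UriKind.Absolute);

        // Uri(Uri, string) resolves a relative href against the base URI;
        // if href is itself absolute, the result is simply that href.
        return new Uri(baseUri, href).AbsoluteUri;
    }
}
```

Resolved against http://xbrl.fasb.org/us-gaap/2018/elts/us-gaap-2018-01-31.xsd, the relative href ../elts/us-gaap-eedm-def-2018-01-31.xml becomes http://xbrl.fasb.org/us-gaap/2018/elts/us-gaap-eedm-def-2018-01-31.xml, and the absolute srt-eedm1-def href is returned as is, so the same call covers both linkbaseRef cases from the analysis.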
