Problem processing Microsoft's authroot.stl #199

Open
xulfir opened this issue Jan 20, 2021 · 1 comment

xulfir commented Jan 20, 2021

Hi, I'm having trouble processing Microsoft's authroot.stl with asn1crypto 1.4.0, and I'm not sure whether the fault is mine. From my investigation, it seems to be related to how Sets are processed, but perhaps I'm using them incorrectly.
To reproduce the issue, you can use code like this:

# python3

from asn1crypto.cms import ContentInfo
from asn1crypto.core import OctetString, Set, Sequence

# Downloaded and extracted from http://www.download.windowsupdate.com/msdownload/update/v3/static/trustedr/en/authrootstl.cab
with open("authroot.stl", "rb") as authroot:
    data = authroot.read()

info = ContentInfo.load(data)

# The field I'm having trouble with is a Set of Sequences
info['content']['encap_content_info']['content'].parsed[4][0][1]

# <asn1crypto.core.Set 2604051076872 b'1\x82\x01\x100\x18\x06\n+\x06\x01\x04\x01\x827\n\x0b~1\n\x04\x08\x00\x00\xd9\xb5D\xc1\xd2\x010\x1e\x06\n+\x06\x01\x04\x01\x827\n\x0bi1\x10\x04\x0e0\x0c\x06\n+\x06\x01\x04\x01\x827<\x03\x020 \x06\n+\x06\x01\x04\x01\x827\n\x0b\x1d1\x12\x04\x10\xf0\xc4\x02\xf0@N\xa9\xad\xbf%\xa0=\xdf,\xa6\xfa0$\x06\n+\x06\x01\x04\x01\x827\n\x0b\x141\x16\x04\x14\x0e\xac\x82`@V\'\x97\xe5%\x13\xfc*\xe1\nS\x95Y\xe4\xa400\x06\n+\x06\x01\x04\x01\x827\n\x0bb1"\x04 \x88]\xe6L4\x0e>\xa7\x06X\xf0\x1e\x11E\xf9W\xfc\xda\'\xaa\xbe\xea\x1a\xb9\xfa\xa9\xfd\xb0\x10-@w0Z\x06\n+\x06\x01\x04\x01\x827\n\x0b\x0b1L\x04JM\x00i\x00c\x00r\x00o\x00s\x00o\x00f\x00t\x00 \x00R\x00o\x00o\x00t\x00 \x00C\x00e\x00r\x00t\x00i\x00f\x00i\x00c\x00a\x00t\x00e\x00 \x00A\x00u\x00t\x00h\x00o\x00r\x00i\x00t\x00y\x00\x00\x00'>

# When I try to view the native entry, I get an error. Similar problems happen if I try to subscript to reach individual members.
info['content']['encap_content_info']['content'].parsed[4][0][1].native

# ValueError: Data for field 0 (universal class, constructed method, tag 16) does not match any of the field definitions
#     while parsing asn1crypto.core.Set

From what I can tell, the ASN.1 is properly formatted, but it seems there might be a problem with Set or how I'm using it. Some examples:

# We can process a sequence of an OID fine...
# SEQUENCE (1 elem)
#   OBJECT IDENTIFIER 1.3.6.1.4.1.311.10.11.126
Sequence.load(bytes.fromhex('300c060a2b0601040182370a0b7e')).native

# OrderedDict([('0', '1.3.6.1.4.1.311.10.11.126')])


# ...but wrap it in a Set and we get errors again
# SET (1 elem)
#   SEQUENCE (1 elem)
#     OBJECT IDENTIFIER 1.3.6.1.4.1.311.10.11.126
Set.load(bytes.fromhex('310e300c060a2b0601040182370a0b7e')).native

# ValueError: Data for field 0 (universal class, constructed method, tag 16) does not match any of the field definitions
#     while parsing asn1crypto.core.Set


# A similar issue can happen for other types of data. Later on in the original structure,
# there's a Set containing an OctetString that would apparently also fail.
# SET (1 elem)
#   OCTET STRING (8 byte) 0000D9B544C1D201
Set.load(bytes.fromhex('310a04080000d9b544c1d201')).native

# ValueError: Data for field 0 (universal class, primitive method, tag 4) does not match any of the field definitions
#     while parsing asn1crypto.core.Set


# The contained OctetString processes without issue
# OCTET STRING (8 byte) 0000D9B544C1D201
OctetString.load(bytes.fromhex('04080000d9b544c1d201')).native

# b'\x00\x00\xd9\xb5D\xc1\xd2\x01'
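
The claim that the ASN.1 is properly formatted can be checked without asn1crypto by walking the tag-length-value structure by hand. The following is a minimal sketch using only the standard library. `read_tlv` and `decode_oid` are hypothetical helper names, not asn1crypto functions, and the sketch assumes single-byte tags and definite lengths, which is all the sample bytes need.

```python
def read_tlv(data, offset=0):
    """Return (tag, value, next_offset) for the DER element at offset.

    Sketch only: handles single-byte tags and definite lengths.
    """
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits give the number of length bytes.
        num_bytes = length & 0x7F
        length = int.from_bytes(data[offset:offset + num_bytes], "big")
        offset += num_bytes
    end = offset + length
    if end > len(data):
        raise ValueError("element runs past the end of the data")
    return tag, data[offset:end], end


def decode_oid(value):
    """Decode OBJECT IDENTIFIER content bytes into dotted notation."""
    arcs = []
    current = 0
    for byte in value:
        # Each arc is base-128; the high bit marks a continuation byte.
        current = (current << 7) | (byte & 0x7F)
        if not byte & 0x80:
            arcs.append(current)
            current = 0
    # The first encoded arc packs the first two components as 40*X + Y.
    first = arcs[0]
    if first < 80:
        head = [first // 40, first % 40]
    else:
        head = [2, first - 80]
    return ".".join(str(arc) for arc in head + arcs[1:])


# The SET-wrapped example from above.
data = bytes.fromhex("310e300c060a2b0601040182370a0b7e")
set_tag, set_body, set_end = read_tlv(data)
seq_tag, seq_body, _ = read_tlv(set_body)
oid_tag, oid_body, _ = read_tlv(seq_body)
oid = decode_oid(oid_body)
print(hex(set_tag), hex(seq_tag), hex(oid_tag), oid)
```

The three tags come out as SET (0x31), SEQUENCE (0x30) and OBJECT IDENTIFIER (0x06), and the lengths nest exactly, so the encoding itself looks sound.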

Any help would be appreciated. Thanks for taking a look!

@wbond wbond added the question label Aug 7, 2021

geitda commented May 13, 2022

Here's a very hacky workaround to get it to actually spit out data for you. I decided to call the huge sequence of 437 elements a RootList and define enough structure to get access to everything:

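# SetOfAny lives in asn1crypto.cms; the other types come from asn1crypto.core
from asn1crypto.cms import SetOfAny
from asn1crypto.core import ObjectIdentifier, OctetString, Sequence, SequenceOf, SetOf

class RootListAttribute(Sequence):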
    _fields = [
        ('type', ObjectIdentifier),
        ('values', None),
    ]

    # If we know anything about the inner OIDs we can make intelligent parsing decisions
    # Otherwise, just parse as SetOfAny
    _oid_specs = {}

    def _values_spec(self):
        return self._oid_specs.get(self['type'].native, SetOfAny)

    _spec_callbacks = {
        'values': _values_spec
    }

class RootListAttributes(SetOf):
    _child_spec = RootListAttribute

class RootListEntry(Sequence):
    _fields = [
        ('digest', OctetString),
        ('attributes', RootListAttributes),
    ]

class RootList(SequenceOf):
    _child_spec = RootListEntry

This gives you enough access that .native works:

# Re-parse as 'RootList' type
>>> rl = RootList.load(info['content']['encap_content_info']['content'].parsed[4].dump())
>>> len(rl)
437
>>> rl[0].native
OrderedDict([
  ('digest', b'\xcd\xd4\xee\xae`\x00\xac\x7f@\xc3\x80,\x17\x1e0\x14\x800\xc0r'),
  ('attributes', [
    OrderedDict([('type', '1.3.6.1.4.1.311.10.11.104'), ('values', [b'\x00\x80\xc8+h\x86\xd7\x01'])]),
    OrderedDict([('type', '1.3.6.1.4.1.311.10.11.126'), ('values', [b'\x00\x00\xd9\xb5D\xc1\xd2\x01'])]),
    OrderedDict([('type', '1.3.6.1.4.1.311.10.11.105'), ('values', [b'0\x0c\x06\n+\x06\x01\x04\x01\x827<\x03\x02'])]),
    OrderedDict([('type', '1.3.6.1.4.1.311.10.11.29'), ('values', [b'\xf0\xc4\x02\xf0@N\xa9\xad\xbf%\xa0=\xdf,\xa6\xfa'])]),
    OrderedDict([('type', '1.3.6.1.4.1.311.10.11.20'), ('values', [b"\x0e\xac\x82`@V'\x97\xe5%\x13\xfc*\xe1\nS\x95Y\xe4\xa4"])]),
    OrderedDict([('type', '1.3.6.1.4.1.311.10.11.98'), ('values', [b"\x88]\xe6L4\x0e>\xa7\x06X\xf0\x1e\x11E\xf9W\xfc\xda'\xaa\xbe\xea\x1a\xb9\xfa\xa9\xfd\xb0\x10-@w"])]),
    OrderedDict([('type', '1.3.6.1.4.1.311.10.11.11'), ('values', [b'M\x00i\x00c\x00r\x00o\x00s\x00o\x00f\x00t\x00 \x00R\x00o\x00o\x00t\x00 \x00C\x00e\x00r\x00t\x00i\x00f\x00i\x00c\x00a\x00t\x00e\x00 \x00A\x00u\x00t\x00h\x00o\x00r\x00i\x00t\x00y\x00\x00\x00'])])
  ])
])
# This .native call is error-free!
>>> native = rl.native
>>> len(native)
437

This should let you get a little farther. To get the inner data from the OctetString, call rl[i]['attributes'][j]['values'][0].native, where i indexes the entry in the huge list, j indexes the attribute within that entry, and 0 selects the value (it looks like there is always exactly one). Sometimes that inner data is more ASN.1, sometimes not. You'd have to know all those OIDs to know which is which, and Microsoft seems to have almost no documentation covering this.
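
As a rough aid for telling the two cases apart without knowing the OIDs, here is a minimal sketch that checks whether a value is exactly one definite-length TLV. `looks_like_der` is a hypothetical helper, not part of asn1crypto, and it is only a heuristic: arbitrary bytes can pass by coincidence. The sample values are taken from the rl[0] output above. Treating the 1.3.6.1.4.1.311.10.11.11 value as a NUL-terminated UTF-16LE string is an assumption based on its byte pattern, not on any documentation.

```python
def looks_like_der(value):
    """Heuristic: True if value is exactly one definite-length TLV."""
    if len(value) < 2:
        return False
    if value[0] == 0x00 or value[0] & 0x1F == 0x1F:
        # Reserved tag 0 and multi-byte tags are not handled in this sketch.
        return False
    length = value[1]
    offset = 2
    if length == 0x80:
        return False  # indefinite length is not valid DER
    if length & 0x80:
        num_bytes = length & 0x7F
        if offset + num_bytes > len(value):
            return False
        length = int.from_bytes(value[offset:offset + num_bytes], "big")
        offset += num_bytes
    # The single element must account for every byte of the value.
    return offset + length == len(value)


# Values copied from the rl[0] output above.
nested = bytes.fromhex("300c060a2b0601040182373c0302")  # ...10.11.105
opaque = bytes.fromhex("0000d9b544c1d201")              # ...10.11.126
name_raw = (                                            # ...10.11.11
    b'M\x00i\x00c\x00r\x00o\x00s\x00o\x00f\x00t\x00 \x00'
    b'R\x00o\x00o\x00t\x00 \x00'
    b'C\x00e\x00r\x00t\x00i\x00f\x00i\x00c\x00a\x00t\x00e\x00 \x00'
    b'A\x00u\x00t\x00h\x00o\x00r\x00i\x00t\x00y\x00\x00\x00'
)

nested_is_der = looks_like_der(nested)
opaque_is_der = looks_like_der(opaque)
name_is_der = looks_like_der(name_raw)

# Assumption: the value is UTF-16LE text with a trailing NUL terminator.
friendly_name = name_raw.decode("utf-16-le").rstrip("\x00")
print(nested_is_der, opaque_is_der, name_is_der, friendly_name)
```

Values that pass the check are candidates for another load call (for example Sequence.load); the rest are better left as raw bytes until their meaning is known.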
