jobs-github/yas

YAS (Yet Another Schema)

Copyright (c) 2016

Definition

YAS (Yet Another Schema) is a general specification for describing data formats, similar in role to a .proto file in protobuf.

Features

  • Succinct
    Few tags for grammar. Everything can be done with only 7 tags.
  • Cross-Protocol
    Supports JSON, XML, and binary protocols.
  • Flexible
    Refer to Extension.

Specification

Format

YAS uses JSON as its description format.

Types

The built-in types of YAS include the following:

  • bool
  • int8_t
  • uint8_t
  • int16_t
  • uint16_t
  • int32_t
  • uint32_t
  • int64_t
  • uint64_t
  • float
  • double
  • string
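
As a rough illustration, the built-in names can be collected into a lookup table. The sketch below assumes the integer names carry their usual meaning from C's <stdint.h>, which the list implies but does not state; `INTEGER_BITS`, `BUILTIN_TYPES`, and `integer_range` are hypothetical names, not part of YAS.

```python
# Hypothetical helper, not part of YAS: bit widths of the integer built-ins,
# assuming the names mean what they do in C's <stdint.h>.
INTEGER_BITS = {
    "int8_t": 8, "uint8_t": 8,
    "int16_t": 16, "uint16_t": 16,
    "int32_t": 32, "uint32_t": 32,
    "int64_t": 64, "uint64_t": 64,
}

# All twelve built-in type names listed above.
BUILTIN_TYPES = set(INTEGER_BITS) | {"bool", "float", "double", "string"}


def integer_range(type_name):
    """Return (min, max) for an integer built-in type."""
    bits = INTEGER_BITS[type_name]
    if type_name.startswith("u"):
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
```

A schema tool could use such a table to reject values that do not fit the declared field type.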

Array

Define an array like this:

[type-str]

type-str can be any type.

Dict

Define a dict like this:

{type-str}

type-str can be any type. It describes the dict's values; keys are always strings.
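
The bracket notation for array and dict is easy to recognize mechanically. Here is a minimal sketch; `classify_type_str` is a hypothetical name, and treating an empty or unbalanced container as an error is an assumption, since the specification does not say how malformed strings are handled.

```python
def classify_type_str(field_type):
    """Return (kind, inner), where kind is 'array', 'dict', or 'plain'."""
    # Opening bracket -> (expected closing bracket, kind).
    pairs = {"[": ("]", "array"), "{": ("}", "dict")}
    if field_type and field_type[0] in pairs:
        close, kind = pairs[field_type[0]]
        # Assumption: an empty or unbalanced container is an error.
        if len(field_type) < 3 or field_type[-1] != close:
            raise ValueError(f"malformed container type: {field_type!r}")
        return kind, field_type[1:-1]
    return "plain", field_type
```

For example, `classify_type_str("[int32_t]")` yields the kind `array` with inner type `int32_t`, and `classify_type_str("{string}")` yields the kind `dict` with inner type `string`.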

Nested Object

Sample

  • Objects of different types can be nested. For example, you can use object_base_t inside object_t.
  • A nested object must be defined first. For instance, object_base_t must be defined before object_t.
  • A self-nested object is NOT allowed.
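
The two ordering rules above can be checked in a single pass over the schema. The sketch below is one possible reading, not a reference implementation: `check_definition_order` is a hypothetical name, and it assumes that any reference to the enclosing type, even inside an array or dict, counts as self-nesting.

```python
BUILTIN_TYPES = {
    "bool", "int8_t", "uint8_t", "int16_t", "uint16_t", "int32_t",
    "uint32_t", "int64_t", "uint64_t", "float", "double", "string",
}


def check_definition_order(schema):
    """Return a list of error strings; an empty list means the order is valid."""
    errors = []
    defined = set()
    for struct in schema["structs"]:
        name = struct["type"]
        for field_type, field_name, *_ in struct["members"]:
            # Drop array/dict brackets to get the innermost type name.
            base = field_type.strip("[]{}")
            if base in BUILTIN_TYPES:
                continue
            if base == name:
                errors.append(f"{name}.{field_name}: self-nested object")
            elif base not in defined:
                errors.append(
                    f"{name}.{field_name}: {base} is not defined before {name}"
                )
        defined.add(name)
    return errors
```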

Tags

YAS supports the following tag structure:

structs
  struct
    type
    members
      member
        field_type
        field_name

Comment: structs is an array, so struct serves as a transitional tag that describes each item of the array. Such transitional tags do NOT appear in the schema as keys of their own.

structs

Type: array

Value: [<struct>, <struct> ... <struct>]

Attribute: required

Parent: none

The list of object types. Each item is a struct.

struct

Type: object

Value: { "type": <type-str>, "members": <members> }

Attribute: required

Parent: structs

Describes a single data struct, which consists of the following members: type and members.

type

Type: string

Value: <type-str>

Attribute: required

Parent: struct

Specifies the type of the object. Duplicated types are not allowed.

Recommended naming style:

  • all lowercase
  • with underscores between words
  • with a trailing _t

members

Type: array

Value: [<member>, <member> ... <member>]

Attribute: required

Parent: struct

The list of the object's members. Each item is a member.

member

Type: array

Value: [<field_type>, <field_name>]

Attribute: required

Parent: members

Defines a member of the object, which consists of the following items: field_type and field_name.

field_type

Type: string

Value: <type-str>

Attribute: required

Parent: member

Specifies the type of the field.

field_name

Type: string

Value: <name-str>

Attribute: required

Parent: member

Specifies the name of the field. Duplicated names are not allowed in the same object.
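
Taken together, the tag rules above amount to a small structural check on the parsed JSON. The following is a minimal sketch of such a check, assuming the schema has already been loaded with `json.load`; `validate_schema` is a hypothetical name, and the function also accepts three-item members so that it can be run on schemas that use the extended tags.

```python
def validate_schema(schema):
    """Return a list of error strings; an empty list means the schema is valid."""
    structs = schema.get("structs") if isinstance(schema, dict) else None
    if not isinstance(structs, list):
        return ["structs: required array is missing"]
    errors = []
    seen_types = set()
    for i, struct in enumerate(structs):
        where = f"structs[{i}]"
        if not isinstance(struct, dict):
            errors.append(f"{where}: struct must be an object")
            continue
        type_name = struct.get("type")
        if not isinstance(type_name, str):
            errors.append(f"{where}: type must be a string")
        elif type_name in seen_types:
            errors.append(f"{where}: duplicated type {type_name}")
        else:
            seen_types.add(type_name)
        members = struct.get("members")
        if not isinstance(members, list):
            errors.append(f"{where}: members must be an array")
            continue
        seen_names = set()
        for j, member in enumerate(members):
            here = f"{where}.members[{j}]"
            # A member is an array of strings: type, name, and (in the
            # extended specification) an optional third item.
            if (not isinstance(member, list) or len(member) not in (2, 3)
                    or not all(isinstance(item, str) for item in member)):
                errors.append(f"{here}: expected [field_type, field_name] strings")
                continue
            if member[1] in seen_names:
                errors.append(f"{here}: duplicated field_name {member[1]}")
            seen_names.add(member[1])
    return errors
```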

Extension

Multilayer Nested Objects

  • array and dict can be nested within each other. The nesting depth is unlimited.

For example:

[{[object_base_t]}]

It will be parsed as the following type in C++:

std::vector<std::map<std::string, std::vector<object_base_t>>>
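
The translation from a nested type string to a C++ type can be sketched as a short recursive function. This only illustrates the mapping shown above and is not the project's actual code generator: `to_cpp_type` is a hypothetical name, and mapping `string` to `std::string` is an assumption based on the key type in the example.

```python
def to_cpp_type(field_type):
    """Translate a YAS type string into a C++ type name."""
    if field_type.startswith("[") and field_type.endswith("]"):
        # [T] becomes a vector of T.
        return f"std::vector<{to_cpp_type(field_type[1:-1])}>"
    if field_type.startswith("{") and field_type.endswith("}"):
        # {T} becomes a map from string keys to T.
        return f"std::map<std::string, {to_cpp_type(field_type[1:-1])}>"
    if field_type == "string":
        # Assumption: the built-in string type maps to std::string.
        return "std::string"
    # Other built-ins and user-defined structs keep their names.
    return field_type


print(to_cpp_type("[{[object_base_t]}]"))
# prints std::vector<std::map<std::string, std::vector<object_base_t>>>
```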

Extended Tags

Text protocol:

member
  field_type
  field_name
  default_value

Binary protocol:

member
  field_type
  field_name
  field_id

member

Type: array

Value: [<field_type>, <field_name>, <default_value>]

Attribute: required

Parent: members

Defines a member of the object, which consists of the following items: field_type, field_name, and default_value.

default_value

Type: string

Value: <value-str>

Attribute: optional

Parent: member

Specifies the default value of the field. It must be written as a string, whatever the field type.

The following is incorrect:

...
["bool", "bool_val", true],
["int8_t", "int8_val", 'a'],
["int32_t", "int32_val", -128]
...

The right way:

...
["bool", "bool_val", "true"],
["int8_t", "int8_val", "'a'"],
["int32_t", "int32_val", "-128"]
...

A default value for a string field can be defined like this:

["string", "str_val", "test"]
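
Because every default is stored as a string, a consumer has to convert it back according to the field type. The sketch below shows one way to do that, guided only by the forms that appear in the samples (`"true"`, `"'a'"`, `"-128"`, `"0xffffffffffffffff"`, `"111111.111111"`). `parse_default` is a hypothetical name, and accepting `"false"` for bool is an assumption.

```python
def parse_default(field_type, value):
    """Convert a default_value string into a Python value."""
    if not isinstance(value, str):
        raise TypeError("default_value must be a string")
    if field_type == "string":
        return value
    if field_type == "bool":
        # Assumption: "false" is the counterpart of the documented "true".
        if value not in ("true", "false"):
            raise ValueError(f"invalid bool default: {value!r}")
        return value == "true"
    if field_type in ("float", "double"):
        return float(value)
    # Integer types: a quoted character such as "'a'", or a decimal or
    # hexadecimal literal. Any other field type fails in int() below.
    if len(value) == 3 and value[0] == value[-1] == "'":
        return ord(value[1])
    return int(value, 0)
```

A non-string default such as the bare `true` from the incorrect example is rejected with a `TypeError`.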

member

Type: array

Value: [<field_type>, <field_name>, <field_id>]

Attribute: required

Parent: members

Defines a member of the object, which consists of the following items: field_type, field_name, and field_id.

field_id

Type: string

Value: <id-str>

Attribute: optional

Parent: member

Specifies the unique numeric ID of the field. It must be written as a string. Duplicated IDs are not allowed in the same object.

The following is incorrect:

...
["bool", "bool_val", 100],
["int8_t", "int8_val", 101],
["int32_t", "int32_val", 102]
...

The right way:

...
["bool", "bool_val", "100"],
["int8_t", "int8_val", "101"],
["int32_t", "int32_val", "102"]
...
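
The field_id rules (a string, numeric, unique within an object) can be checked per struct. Here is a minimal sketch; `check_field_ids` is a hypothetical name, and requiring plain ASCII digits is an assumption, since the specification says only that the ID is numeric.

```python
def check_field_ids(struct):
    """Return a list of error strings for the field_ids of one struct."""
    errors = []
    seen = {}  # field_id -> name of the field that first used it
    for member in struct["members"]:
        if len(member) < 3:
            continue  # field_id is optional
        field_name, field_id = member[1], member[2]
        if not (isinstance(field_id, str) and field_id.isascii()
                and field_id.isdigit()):
            errors.append(f"{field_name}: field_id must be a string of digits")
        elif field_id in seen:
            errors.append(
                f"{field_name}: field_id {field_id} already used by {seen[field_id]}"
            )
        else:
            seen[field_id] = field_name
    return errors
```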

Samples

Cross-Protocol

{
    "structs": 
    [
        {
            "type": "perf_object_t",
            "members": 
            [
                ["bool", "bool_val"],
                ["int8_t", "int8_val"],
                ["uint8_t", "uint8_val"],
                ["int16_t", "int16_val"],
                ["uint16_t", "uint16_val"],
                ["int32_t", "int32_val"],
                ["uint32_t", "uint32_val"],
                ["int64_t", "int64_val"],
                ["uint64_t", "uint64_val"],
                ["float", "float_val"],
                ["double", "double_val"],
                ["string", "str_val"],
                ["[int32_t]", "vec_val"],
                ["{string}", "dict_val"]
            ]
        }
    ]
}

Comment: this is the most generic form, which works well in both the text protocol and the binary protocol.

Text Protocol

{
    "structs": 
    [
        {
            "type": "sample_struct_t",
            "members": 
            [
                ["bool", "bool_val", "true"],
                ["string", "str_val", "test"],
                ["int32_t", "int_val", "-111111"],
                ["uint32_t", "uint_val", "111111"],
                ["double", "double_val", "111111.111111"],
                ["int8_t", "char_val", "'a'"],
                ["uint8_t", "uchar_val", "128"],
                ["int16_t", "short_val", "-256"],
                ["uint16_t", "ushort_val", "512"],
                ["int64_t", "int64_val", "-9223372036854775807"],
                ["uint64_t", "uint64_val", "0xffffffffffffffff"],
                ["float", "float_val", "111111.111111"],
                ["[int32_t]", "vec_val"],
                ["{string}", "str_map_val"]
            ]
        }
    ]
}

Comment: this example uses the default_value tag from the extended specification.

Binary Protocol

{
    "structs": [
        {
            "type": "sample_struct_t",
            "members": [
                ["int8_t", "int8_val", "1"],
                ["uint8_t", "uint8_val"],
                ["string", "str_val", "10000"],
                ["[string]", "str_arr_val"],
                ["{string}", "str_dict_val", "2"]
            ]
        },
        {
            "type": "sample_object_t",
            "members": [
                ["sample_struct_t", "obj"],
                ["[sample_struct_t]", "arr"],
                ["{sample_struct_t}", "dict"]
            ]
        }
    ]
}

Comment: this example uses the field_id tag from the extended specification.

Multilayer Nested Objects

Text protocol:

{
    "structs": 
    [
        {
            "type": "object_base_t",
            "members": 
            [
                ["bool", "bool_val", "true"],
                ["int8_t", "int8_val"],
                ["uint8_t", "uint8_val"],
                ["int16_t", "int16_val"],
                ["uint16_t", "uint16_val"],
                ["int32_t", "int32_val", "-128"],
                ["uint32_t", "uint32_val", "65536"],
                ["int64_t", "int64_val"],
                ["uint64_t", "uint64_val", "0xffffffffffffffff"],
                ["float", "float_val"],
                ["double", "double_val"],
                ["string", "str_val", "test"],
                ["[int32_t]", "vec_val"],
                ["{string}", "dict_val"]
            ]
        },
        {
            "type": "object_t",
            "members": 
            [
                ["object_base_t", "obj_val"],
                ["[object_base_t]", "obj_vec_val"],
                ["{object_base_t}", "obj_dict_val"],
                ["{[object_base_t]}", "obj_vec_dict_val"],
                ["[{object_base_t}]", "obj_dict_vec_val"],
                ["[[object_base_t]]", "obj_vec_vec_val"],
                ["{{object_base_t}}", "obj_dict_dict_val"],
                ["[{[object_base_t]}]", "obj_vec_dict_vec_val"]
            ]
        }
    ]
}

Binary protocol:

{
    "structs": 
    [
        {
            "type": "object_base_t",
            "members": 
            [
                ["bool", "bool_val", "100"],
                ["int8_t", "int8_val"],
                ["uint8_t", "uint8_val"],
                ["int16_t", "int16_val"],
                ["uint16_t", "uint16_val"],
                ["int32_t", "int32_val", "101"],
                ["uint32_t", "uint32_val", "102"],
                ["int64_t", "int64_val"],
                ["uint64_t", "uint64_val", "103"],
                ["float", "float_val"],
                ["double", "double_val"],
                ["string", "str_val", "110"],
                ["[int32_t]", "vec_val"],
                ["{string}", "dict_val"]
            ]
        },
        {
            "type": "object_t",
            "members": 
            [
                ["object_base_t", "obj_val"],
                ["[object_base_t]", "obj_vec_val"],
                ["{object_base_t}", "obj_dict_val"],
                ["{[object_base_t]}", "obj_vec_dict_val"],
                ["[{object_base_t}]", "obj_dict_vec_val"],
                ["[[object_base_t]]", "obj_vec_vec_val"],
                ["{{object_base_t}}", "obj_dict_dict_val"],
                ["[{[object_base_t]}]", "obj_vec_dict_vec_val"]
            ]
        }
    ]
}

Appendix

FAQ

Q: Why use YAS rather than the schema of protobuf, thrift, and so on?

We respect industrial standards such as protobuf and thrift, as they are extremely powerful. ^_^
However, they come with the following disadvantages:
Firstly, their schemas are incompatible, as they have different "fathers";
Secondly, their protocols are incompatible, as they have different "fathers";
Thirdly, their schemas are not Cross-Protocol (this applies to protobuf, thrift, and flatbuffers);
Finally, their grammar is complicated: both protobuf and thrift need far more than 7 tags.

YAS was created for these reasons. It is designed with the following aims:

  • Succinct
    7 tags at most to describe the grammar.
  • Cross-Protocol
    One schema, multiple protocols.
  • Flexible
    With Extension, it is easy to define Multilayer Nested Objects, which protobuf cannot express directly.

Q: How can Multilayer Nested Objects be defined with the Specification alone, without Extension?

Writing it like this is not allowed:

{
    "structs": 
    [
        {
            "type": "object_base_t",
            "members": 
            [
                ["bool", "bool_val"]
            ]
        },
        {
            "type": "object_t",
            "members": 
            [
                ["{[object_base_t]}", "obj_vec_dict_val"]
            ]
        }
    ]
}

Here is a workaround that uses an intermediate object:

{
    "structs": 
    [
        {
            "type": "object_base_t",
            "members": 
            [
                ["bool", "bool_val"]
            ]
        },
        {
            "type": "object_base_array_t",
            "members": 
            [
                ["[object_base_t]", "obj_arr_val"]
            ]
        },
        {
            "type": "object_t",
            "members": 
            [
                ["{object_base_array_t}", "obj_vec_dict_val"]
            ]
        }
    ]
}
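
To tell whether a schema stays within the base Specification, it is enough to measure how deeply each field type nests containers. In the sketch below, `nesting_depth` is a hypothetical name; a depth greater than 1 means the field needs the Extension or the intermediate-object workaround shown above.

```python
def nesting_depth(field_type):
    """Count how many array/dict layers wrap the innermost type."""
    depth = 0
    while (len(field_type) >= 2
           and field_type[0] + field_type[-1] in ("[]", "{}")):
        depth += 1
        field_type = field_type[1:-1]  # peel off one container layer
    return depth


print(nesting_depth("{[object_base_t]}"))     # prints 2
print(nesting_depth("{object_base_array_t}")) # prints 1
```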

Q: Why does dict support only string keys? Is there any plan to support integer keys?

For one thing, the keyword of YAS is succinct;
for another, string keys work in most cases. Supporting other key types would complicate the design.

Good design is not achieved when there is nothing left to add, but when there is nothing left to take away.

Q: Why do the text protocol and the binary protocol use different tags in the Extended Tags chapter?

For the text protocol, the field name is already unique, so a tag like field_id is unnecessary. default_value, however, is used frequently, especially in configuration.
For the binary protocol, using the field name as the unique identifier would hurt data compression and encoding/decoding performance, so a tag like field_id is needed.
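
In practice this means the optional third item of a member is interpreted according to the target protocol. The sketch below shows one way a reader might do this; `read_member` and the protocol names `"text"` and `"binary"` are hypothetical, chosen only for illustration.

```python
def read_member(member, protocol):
    """Turn a member array into a dict, naming the optional third item."""
    if protocol not in ("text", "binary"):
        raise ValueError(f"unknown protocol: {protocol!r}")
    field_type, field_name, *extra = member
    result = {"field_type": field_type, "field_name": field_name}
    if extra:
        # Text schemas carry a default value; binary schemas carry an ID.
        key = "default_value" if protocol == "text" else "field_id"
        result[key] = extra[0]
    return result
```

A two-item member such as `["int8_t", "int8_val"]` produces the same result under both protocols, which is why the generic form is Cross-Protocol.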

LICENSE

YAS is licensed under the New BSD License, which is very flexible to use.

Author

Implements
