We obtain our dataset by crawling verified contracts from Etherscan, which are real-world smart contracts deployed on Ethereum Mainnet.
Our final dataset contains a total of 12,515 smart contracts that have source code and focuses on eight types of vulnerabilities, namely:
- Timestamp dependency
- Block number dependency
- Dangerous delegatecall
- Ether frozen
- Unchecked external call
- Reentrancy
- Integer overflow/underflow
- Dangerous Ether strict equality
The ground truth labels of the smart contracts in the dataset (in the file ground truth label) are confirmed using defined vulnerability-specific patterns followed by manual inspection.
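To illustrate what pattern-based labeling followed by manual inspection could look like, here is a minimal Python sketch. It is not the procedure used to build this dataset: the pattern names, the regular expressions, and the `candidate_labels` helper are all hypothetical, cover only a few of the eight vulnerability types, and merely flag contracts that a human would then review.

```python
import re
from typing import List

# Hypothetical keyword patterns (assumptions for illustration only);
# the dataset's actual vulnerability-specific patterns are not shown here.
CANDIDATE_PATTERNS = {
    "timestamp_dependency": re.compile(r"\bblock\.timestamp\b|\bnow\b"),
    "block_number_dependency": re.compile(r"\bblock\.number\b"),
    "dangerous_delegatecall": re.compile(r"\.delegatecall\s*\("),
    "ether_strict_equality": re.compile(r"\.balance\s*[=!]="),
}


def candidate_labels(source: str) -> List[str]:
    """Return the sorted names of patterns that match the Solidity source.

    A match only marks the contract as a candidate for manual inspection;
    it does not confirm that the vulnerability is present.
    """
    # Remove block comments, then line comments, so that keywords
    # mentioned only in comments do not trigger a match.
    code = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    code = re.sub(r"//[^\n]*", "", code)
    return sorted(
        name for name, pattern in CANDIDATE_PATTERNS.items()
        if pattern.search(code)
    )
```

Keyword matching of this kind is deliberately coarse: it over-approximates, which is why a manual inspection step is still needed before a label is treated as ground truth.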