HamzaAbubakar/sindhi-NLP-dataset

 
 

Sindhi NLP Dataset

This repository is a collection of Sindhi text datasets, ranging from tweets to articles to books. Its purpose is to provide an authentic data source for development and research in the field of Sindhi NLP (Natural Language Processing).

Description of the data

The datasets in this repository are arranged so that each dataset has its own folder. Each folder contains one or more CSV files and a readme file that describes the dataset.

root/
    dataset_1/
        dataset.csv
        README.md
    dataset_2/
        dataset2.csv
        README.md
    ...
    dataset_n/
        dataset_n.csv
        README.md
    README.md
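
The layout above can be checked programmatically. The following is a minimal sketch, not part of the repository: the function name `find_datasets` is hypothetical, and it assumes every non-hidden folder directly under the root is a dataset folder.

```python
from pathlib import Path


def find_datasets(root):
    """Map each dataset folder name to its CSV files and whether it has a README.md."""
    datasets = {}
    for folder in sorted(Path(root).iterdir()):
        # Skip plain files (such as the top-level README.md) and hidden folders (such as .git).
        if not folder.is_dir() or folder.name.startswith("."):
            continue
        datasets[folder.name] = {
            "csv_files": sorted(p.name for p in folder.glob("*.csv")),
            "has_readme": (folder / "README.md").is_file(),
        }
    return datasets
```

Calling `find_datasets(".")` from the root of a local clone returns one entry per dataset folder, which makes it easy to spot a folder that is missing its readme before opening a pull request.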

File formats

This repository contains files in two formats, CSV and Markdown (.md).

Dataset files use the CSV format.
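
As a rough illustration of working with a dataset file, the sketch below reads the header and the first few rows of a CSV file using only the Python standard library. It assumes the file is UTF-8 encoded and has a header row; neither is stated in this README, so check the dataset's own readme first. The function name `read_rows` is hypothetical.

```python
import csv
from itertools import islice


def read_rows(csv_path, limit=5):
    """Return the column names and the first `limit` rows of a CSV file as dictionaries."""
    # Assumption: the file is UTF-8 encoded, which is common for Sindhi text.
    # newline="" lets the csv module handle line endings inside quoted fields.
    with open(csv_path, encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(islice(reader, limit))
        return reader.fieldnames, rows
```

For example, `header, rows = read_rows("dataset_1/dataset.csv")` would return the column names and up to five rows, using the folder and file names from the layout shown earlier as placeholders.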

What you can contribute

We believe that nothing is too small to contribute, even if it is correcting a typo. We encourage everyone to contribute, whether you are a newbie or an experienced GitHub contributor. Following are some ideas for where you can contribute:

  1. Publish a new Sindhi dataset that is not yet available in this repository
  2. Label a dataset
  3. Write documentation
  4. Translate documentation into Sindhi
  5. Clean an existing dataset
  6. Create sample notebooks for the existing datasets

How to contribute

Here you will learn how to contribute to this project and make an impact.

1. Fork this repository

You can fork this repository by clicking the Fork button in the top right corner. Forking creates a copy of the repo under your account.

2. Clone the repository

To clone the repository, go to your account, open your fork of this repo, and either click the clone button or run the command below to get the repository on your local machine.

git clone "URL you just copied"

e.g. git clone https://github.com/yourgithubusername/sindhi-NLP-dataset.git

3. Create a branch

On your local machine, go to the project folder that you cloned and create a new branch by running the following git command inside that folder:

git checkout -b name-of-your-branch

e.g. git checkout -b owais431

4. Let's make some contributions

Make whatever contribution you want to make. We believe that nothing is too small to contribute, even if it is correcting a typo. We encourage everyone, experienced GitHub contributors and newbies alike, to contribute as much as possible.

5. Add and commit changes

Now stage the changes you made on the branch by running the following command:

git add .

Next, commit the changes. A commit message should always be clear. To commit, use the command below:

git commit -m "clear-commit-message-to-show-what-you-did"

6. Push changes to GitHub

Now push the changes you made to the remote repository on your branch. To do so, use the command below:

git push origin name-of-your-branch

The branch name is the same one you created in step 3.

7. Submit your changes for review

Once you have pushed your changes to GitHub, it is time to create a pull request. Go to your repository on GitHub, click "Compare & pull request", and submit the pull request.

Your pull request will soon be merged into the main branch of the project, and you will get a notification once it is merged into the existing code base.

License

This project is licensed under the MIT License; see the LICENSE.md file for details.
