
R wrapper of lynx browser to extract links from web pages

linkXtractor

The goal of linkXtractor is to build a tibble of links from a source webpage URL. It wraps the lynx text-mode browser to do the extraction, so a working lynx installation is required on your system.
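As a rough sketch of the underlying approach: lynx can print a numbered list of every link on a page with its `-dump -listonly` options, and that output can be parsed into a tibble. The function below is illustrative only; the package's actual internals may differ.

```r
library(tibble)

# Illustrative sketch (not the package's real implementation): shell out to
# lynx and collect the link list it prints.
get_links_sketch <- function(url) {
  # lynx -dump -listonly emits lines such as "   1. https://example.com/"
  raw <- system2("lynx", c("-dump", "-listonly", url), stdout = TRUE)
  # Keep the numbered lines and strip the leading " 1. " numbering
  links <- sub("^\\s*\\d+\\.\\s+", "", grep("^\\s*\\d+\\.", raw, value = TRUE))
  tibble(source_url = url, out_link = links)
}
```

The two-column shape (`source_url`, `out_link`) matches the tibble returned by `get_links()` in the example below.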

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("Arf9999/linkXtractor")

Example

This is a basic example, extracting the links from the CNN home page:

library(linkXtractor)
example <- linkXtractor::get_links("https://cnn.com/")
head(example, 10)
#> # A tibble: 10 x 2
#>    source_url       out_link                                                    
#>    <chr>            <chr>                                                       
#>  1 https://cnn.com/ https://edition.cnn.com/                                    
#>  2 https://cnn.com/ https://www.cnn.com/                                        
#>  3 https://cnn.com/ https://plus.google.com/+cnn/posts                          
#>  4 https://cnn.com/ android-app://com.cnn.mobile.android.phone/http/edition.cnn…
#>  5 https://cnn.com/ https://edition.cnn.com/world                               
#>  6 https://cnn.com/ https://edition.cnn.com/politics                            
#>  7 https://cnn.com/ https://edition.cnn.com/business                            
#>  8 https://cnn.com/ https://edition.cnn.com/health                              
#>  9 https://cnn.com/ https://edition.cnn.com/entertainment                       
#> 10 https://cnn.com/ https://edition.cnn.com/style
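Since the result is an ordinary tibble, it can be filtered with base R or dplyr once extracted. For instance, keeping only the links that stay on edition.cnn.com (a hypothetical follow-up, continuing the example above):

```r
# Keep only out_links pointing at edition.cnn.com
internal <- example[grepl("^https://edition\\.cnn\\.com/", example$out_link), ]
```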
