txtStat

An interface for text character analysis.
https://github.com/YujiSODE/txtStat

Copyright (c) 2017 Yuji SODE <[email protected]>
This software is released under the MIT License.
See LICENSE or http://opensource.org/licenses/mit-license.php


Function txtStat(FLG95) loads text data, makes a graph of frequency against the Unicode codepoints, and returns a function that returns the character analysis result.

Script

  • txtStat.js

How to use

  1. To activate the interface: var y=txtStat();
  2. To get the character analysis result: y();
    An object is returned as the result.
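
The two steps above might be combined as in the following minimal sketch. It assumes that txtStat.js has been loaded in a browser page and exposes txtStat as a global, and that the x values in the result are numeric. describeRange is a hypothetical helper for illustration only and is not part of txtStat.js.

```javascript
// Hypothetical helper, not part of txtStat.js.
// getResult is the function returned by txtStat().
function describeRange(getResult) {
  const result = getResult();
  // Assumption: result.xMin.x and result.xMax.x are numeric codepoint values.
  const toHex = (n) => 'U+' + Number(n).toString(16).toUpperCase().padStart(4, '0');
  return toHex(result.xMin.x) + ' to ' + toHex(result.xMax.x);
}

// In a browser page that has loaded txtStat.js:
//   var y = txtStat();            // activates the interface
//   // ...load text and click the Analyze button in the interface...
//   console.log(describeRange(y));
```

The helper only reads fields listed under "Character analysis result"; what y() returns before any text has been analyzed is not documented here, so the call is best made after the Analyze step.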

Interface

  1. Textarea: a text input for analysis.

  2. Color input: a color for a graph.

  3. Load button: loads the text in "Textarea" as one dataset.

  4. Analyze button: analyzes all loaded text datasets and makes a graph of frequency against the Unicode
    codepoints on a canvas tag. Optionally, another canvas tag can also be used for output.

  5. Clear textarea button: clears only "Textarea".

  6. Reset loaded data button: clears only the loaded text datasets.

  7. Close button: closes this interface.
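
The kind of counting behind the Analyze step can be sketched as follows. This is only an illustration of tallying frequency per Unicode codepoint over several loaded datasets; countCodePoints is a hypothetical name, and whether txtStat.js itself counts by full codepoints or by UTF-16 code units is an assumption not confirmed by this README.

```javascript
// Hypothetical helper, not part of txtStat.js.
// Counts how often each Unicode codepoint occurs across several text datasets.
function countCodePoints(texts) {
  const freq = new Map(); // codepoint (number) -> cumulative frequency
  for (const text of texts) {
    // for...of over a string iterates by codepoint, so surrogate pairs stay together.
    for (const ch of text) {
      const cp = ch.codePointAt(0);
      freq.set(cp, (freq.get(cp) || 0) + 1);
    }
  }
  return freq;
}

// Two "loaded" datasets, analyzed together.
const freq = countCodePoints(['aab', 'b\u{1F600}']);
console.log(freq.get(0x61)); // frequency of "a"
```

A Map keyed by codepoint keeps the tally sparse, which suits text whose codepoints are spread over a wide range.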

Character analysis result

This is the object returned by the function that txtStat(FLG95) returns.
This object has 9 values:

  1. data: JSON-formatted loaded data, with a value N indicating the cumulative frequency of the text datasets.

  2. results: a result of analysis.
    This is an object with 4 values:

    • xMax: the max Unicode codepoint value in datasets.
    • xMin: the min Unicode codepoint value in datasets.
    • maxFreq: the max frequency.
    • range: range of datasets.
  3. color: color of graph.

  4. log: a timestamp.

  5. xMax: the max Unicode codepoint value in loaded text datasets.
    This is an object with 4 values (x, x16, char, and xMaxFreq; see 6. xMin).

  6. xMin: the min Unicode codepoint value in loaded text datasets.
    xMax and xMin are objects with 4 values:

    • x: the Unicode codepoint value.
    • x16: the Unicode codepoint value in hexadecimal.
    • char: a character at the Unicode codepoint value.
    • xMaxFreq | xMinFreq: character frequency (xMaxFreq in xMax, xMinFreq in xMin).
  7. dx: true scale of x-axis.
    Another width-extended canvas tag is recommended when dx << 1.

  8. dy: true scale of y-axis.

  9. lineWidth: width of bar chart.
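
To make the shape of the xMax and xMin entries concrete, here is a minimal sketch that derives them from a frequency table. summarizeFrequencies is a hypothetical function, not part of txtStat.js, and the exact formatting of x16 (lowercase, no prefix) is an assumption.

```javascript
// Hypothetical helper, not part of txtStat.js.
// freq: Map of codepoint (number) -> frequency (number).
function summarizeFrequencies(freq) {
  if (freq.size === 0) return null;
  let xMax = -Infinity;
  let xMin = Infinity;
  let maxFreq = 0;
  for (const [x, n] of freq) {
    if (x > xMax) xMax = x;
    if (x < xMin) xMin = x;
    if (n > maxFreq) maxFreq = n;
  }
  // Builds an object with the 4 values described above; freqKey is
  // "xMaxFreq" or "xMinFreq".
  const describe = (x, freqKey) => ({
    x: x,
    x16: x.toString(16), // assumed format: lowercase hexadecimal without prefix
    char: String.fromCodePoint(x),
    [freqKey]: freq.get(x),
  });
  return {
    xMax: describe(xMax, 'xMaxFreq'),
    xMin: describe(xMin, 'xMinFreq'),
    maxFreq: maxFreq,
  };
}

const summary = summarizeFrequencies(new Map([[0x61, 2], [0x62, 3], [0x3042, 1]]));
console.log(summary.xMax.char, summary.xMin.char);
```

The sketch leaves out range, dx, dy, and lineWidth, since their exact definitions are not given here.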

Optional settings

  • Setting of function txtStat(FLG95)
    FLG95: true|false; 95% of the canvas width is shown when FLG95 = true.

  • Setting of another canvas tag to output
    A valid id of another canvas tag can be entered when the Analyze button is clicked.
    When the entered id is valid, the character analysis result is overwritten by the result for that canvas tag.

  • Setting of canvas width
    The width of the other canvas tag can be changed when the entered id is valid.