spelling/0000755000176200001440000000000013743135342012074 5ustar liggesusersspelling/NAMESPACE0000644000176200001440000000040013640214256013303 0ustar liggesusers# Generated by roxygen2: do not edit by hand S3method(print,summary_spellcheck) export(get_wordlist) export(spell_check_files) export(spell_check_package) export(spell_check_setup) export(spell_check_test) export(spell_check_text) export(update_wordlist) spelling/LICENSE0000644000176200001440000000005113640214256013073 0ustar liggesusersYEAR: 2017 COPYRIGHT HOLDER: Jeroen Ooms spelling/man/0000755000176200001440000000000013640214256012645 5ustar liggesusersspelling/man/spell_check_files.Rd0000644000176200001440000000302613640214256016573 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/check-files.R \name{spell_check_files} \alias{spell_check_files} \alias{spell_check_text} \title{Spell Check} \usage{ spell_check_files(path, ignore = character(), lang = "en_US") spell_check_text(text, ignore = character(), lang = "en_US") } \arguments{ \item{path}{path to file(s) to spell check} \item{ignore}{character vector with words which will be added to the \link[hunspell:dictionary]{hunspell::dictionary}} \item{lang}{set \code{Language} field in \code{DESCRIPTION} e.g. \code{"en-US"} or \code{"en-GB"}. For supporting other languages, see the \href{https://docs.ropensci.org/hunspell/articles/intro.html#hunspell-dictionaries}{hunspell vignette}.} \item{text}{character vector with plain text} } \description{ Perform a spell check on document files or plain text. } \details{ This function parses a file based on the file extension, and checks only text fields while ignoring code chunks and metadata. It works particularly well for markdown, but also latex, html, xml, pdf, and plain text are supported. For more information about the underlying spelling engine, see the \href{https://docs.ropensci.org/hunspell/articles/intro.html#hunspell-dictionaries}{hunspell package}. 
} \examples{ # Example files files <- list.files(system.file("examples", package = "knitr"), pattern = "\\\\.(Rnw|Rmd|html)$", full.names = TRUE) spell_check_files(files) } \seealso{ Other spelling: \code{\link{spell_check_package}()}, \code{\link{wordlist}} } \concept{spelling} spelling/man/wordlist.Rd0000644000176200001440000000263413640214256015010 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/wordlist.R \name{wordlist} \alias{wordlist} \alias{update_wordlist} \alias{get_wordlist} \title{The WORDLIST file} \usage{ update_wordlist(pkg = ".", vignettes = TRUE, confirm = TRUE) get_wordlist(pkg = ".") } \arguments{ \item{pkg}{path to package root directory containing the \code{DESCRIPTION} file} \item{vignettes}{check all \code{rmd} and \code{rnw} files in the pkg root directory (e.g. \code{readme.md}) and package \code{vignettes} folder.} \item{confirm}{show changes and ask confirmation before adding new words to the list} } \description{ The package wordlist file is used to allow custom words which will be added to the dictionary when spell checking. It is stored in \code{inst/WORDLIST} in the source package and must contain one word per line in UTF-8 encoded text. } \details{ The \link{update_wordlist} function runs a full spell check on a package, shows the results, and then prompts to add the found words to the package wordlist. Obviously you should check closely that these are legitimate words and not actual spelling errors. It also removes words from the wordlist that no longer appear as spelling errors, either because they have been removed from the documentation or added to the \code{lang} dictionary. 
} \seealso{ Other spelling: \code{\link{spell_check_files}()}, \code{\link{spell_check_package}()} } \concept{spelling} spelling/man/spell_check_package.Rd0000644000176200001440000000464513640214256017074 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/spell-check.R \name{spell_check_package} \alias{spell_check_package} \alias{spelling} \alias{spell_check_setup} \alias{spell_check_test} \title{Package Spell Checking} \usage{ spell_check_package(pkg = ".", vignettes = TRUE, use_wordlist = TRUE) spell_check_setup(pkg = ".", vignettes = TRUE, lang = "en-US", error = FALSE) } \arguments{ \item{pkg}{path to package root directory containing the \code{DESCRIPTION} file} \item{vignettes}{check all \code{rmd} and \code{rnw} files in the pkg root directory (e.g. \code{readme.md}) and package \code{vignettes} folder.} \item{use_wordlist}{ignore words in the package \link[=get_wordlist]{WORDLIST} file} \item{lang}{set \code{Language} field in \code{DESCRIPTION} e.g. \code{"en-US"} or \code{"en-GB"}. For supporting other languages, see the \href{https://docs.ropensci.org/hunspell/articles/intro.html#hunspell-dictionaries}{hunspell vignette}.} \item{error}{should \verb{CMD check} fail if spelling errors are found? Default only prints results.} } \description{ Automatically spell-check package description, documentation, and vignettes. } \details{ Parses and checks R manual pages, rmd/rnw vignettes, and text fields in the package \code{DESCRIPTION} file. The preferred spelling language (typically \code{en-GB} or \code{en-US}) should be specified in the \code{Language} field from your package \code{DESCRIPTION}. To whitelist custom words use the package \link[=get_wordlist]{WORDLIST} file which will be added to the dictionary when spell checking. See \link{update_wordlist} to automatically populate and update this file. 
The \link{spell_check_setup} function adds a unit test to your package which automatically runs a spell check on documentation and vignettes during \verb{R CMD check} if the environment variable \code{NOT_CRAN} is set to \code{TRUE}. By default this unit test never fails; it merely prints potential spelling errors to the console. If not already done, the \link{spell_check_setup} function will add \code{spelling} as a \code{Suggests} dependency, and a \code{Language} field to \code{DESCRIPTION}. Hunspell includes dictionaries for \code{en_US} and \code{en_GB} by default. Other languages require installation of a custom dictionary, see \link[hunspell:hunspell]{hunspell} for details. } \seealso{ Other spelling: \code{\link{spell_check_files}()}, \code{\link{wordlist}} } \concept{spelling} spelling/DESCRIPTION0000644000176200001440000000245613743135342013611 0ustar liggesusersPackage: spelling Title: Tools for Spell Checking in R Version: 2.2 Authors@R: c( person("Jeroen", "Ooms", , "jeroen@berkeley.edu", role = c("cre", "aut"), comment = c(ORCID = "0000-0002-4035-0289")), person("Jim", "Hester", , "james.hester@rstudio.com", role = "aut")) Description: Spell checking common document formats including latex, markdown, manual pages, and description files. Includes utilities to automate checking of documentation and vignettes as a unit test during 'R CMD check'. Both British and American English are supported out of the box and other languages can be added. In addition, packages may define a 'wordlist' to allow custom terminology without having to abuse punctuation. 
License: MIT + file LICENSE Encoding: UTF-8 LazyData: true URL: https://docs.ropensci.org/spelling/ (website) https://github.com/ropensci/spelling (devel) BugReports: https://github.com/ropensci/spelling/issues Imports: commonmark, xml2, hunspell (>= 3.0), knitr Suggests: pdftools RoxygenNote: 6.1.99.9001 Language: en-GB NeedsCompilation: no Packaged: 2020-10-18 21:20:24 UTC; jeroen Author: Jeroen Ooms [cre, aut] (), Jim Hester [aut] Maintainer: Jeroen Ooms Repository: CRAN Date/Publication: 2020-10-18 22:00:02 UTC spelling/tests/0000755000176200001440000000000013640214256013234 5ustar liggesusersspelling/tests/spelling.R0000644000176200001440000000007413640214256015175 0ustar liggesusersspelling::spell_check_test(vignettes = TRUE, error = FALSE) spelling/NEWS0000644000176200001440000000266013742570571012605 0ustar liggesusers2.2 - WORDLIST is now sorted with a locale-independent method, which avoids large diffs in version control due to the fact that developers use different locales, with different lexicographic ordering rules (#48, @bisaloo) - spell_check_package() now loads Rd macros (#42) 2.1 - Pre-filter script/style/img tags when checking html files because the huge embedded binary blobs produced by rmarkdown slow down the hunspell parser. - Treat input files in spell_check_files() as UTF-8 on all platforms - Fix a sorting bug in spell_check_files() 2.0 - spell_check_package() now also checks README.md and NEWS.md in the package root - Enforce latest hunspell and libhunspell, which include updated dictionaries - Treat all input as UTF-8. 
Fixes some false positives on Windows - Ignore yaml front matter in markdown except for 'title', 'subtitle', and 'description' - Markdown: filter words that contain an '@' symbol (citation key or email address) - Properly parse authors@R field for ignore list (issue #2) - Use tools::file_ext instead of knitr:::file_ext 1.2 - Internally normalize all case of lang strings to lower_UPPER e.g. en_US - Only run automatic check when 'spelling' is available and NOT_CRAN is set 1.1 - Breaking: Package spell-checker now uses language from DESCRIPTION - Require hunspell 2.9 dependency (better parsing and dictionaries) - Change default lang to 'en_US' 1.0 - Initial release spelling/R/0000755000176200001440000000000013742570571012303 5ustar liggesusersspelling/R/rmarkdown.R0000644000176200001440000000211413640214256014420 0ustar liggesusers# This is borrowed from the rmarkdown pkg partition_yaml_front_matter <- function (input_lines) { validate_front_matter <- function(delimiters) { if (length(delimiters) >= 2 && (delimiters[2] - delimiters[1] > 1) && grepl("^---\\s*$", input_lines[delimiters[1]])) { if (delimiters[1] == 1) TRUE else is_blank(input_lines[1:delimiters[1] - 1]) } else { FALSE } } delimiters <- grep("^(---|\\.\\.\\.)\\s*$", input_lines) if (validate_front_matter(delimiters)) { front_matter <- input_lines[(delimiters[1]):(delimiters[2])] input_body <- c() if (delimiters[1] > 1) input_body <- c(input_body, input_lines[1:delimiters[1] - 1]) if (delimiters[2] < length(input_lines)) input_body <- c(input_body, input_lines[-(1:delimiters[2])]) list(front_matter = front_matter, body = input_body) } else { list(front_matter = NULL, body = input_lines) } } is_blank <- function(x) { if (length(x)) all(grepl("^\\s*$", x)) else TRUE } spelling/R/check-files.R0000644000176200001440000001332513640214256014577 0ustar liggesusers#' Spell Check #' #' Perform a spell check on document files or plain text. 
#' #' This function parses a file based on the file extension, and checks only #' text fields while ignoring code chunks and metadata. It works particularly #' well for markdown, but also latex, html, xml, pdf, and plain text are #' supported. #' #' For more information about the underlying spelling engine, see the #' [hunspell package](https://docs.ropensci.org/hunspell/articles/intro.html#hunspell-dictionaries). #' #' @rdname spell_check_files #' @family spelling #' @inheritParams spell_check_package #' @param path path to file(s) to spell check #' @param ignore character vector with words which will be added to the [hunspell::dictionary] #' @export #' @examples # Example files #' files <- list.files(system.file("examples", package = "knitr"), #' pattern = "\\.(Rnw|Rmd|html)$", full.names = TRUE) #' spell_check_files(files) spell_check_files <- function(path, ignore = character(), lang = "en_US"){ stopifnot(is.character(ignore)) lang <- normalize_lang(lang) dict <- hunspell::dictionary(lang, add_words = ignore) path <- sort(normalizePath(path, mustWork = TRUE)) lines <- lapply(path, spell_check_file_one, dict = dict) summarize_words(path, lines) } spell_check_file_one <- function(path, dict){ if(grepl("\\.r?md$",path, ignore.case = TRUE)) return(spell_check_file_md(path, dict = dict)) if(grepl("\\.rd$", path, ignore.case = TRUE)) return(spell_check_file_rd(path, dict = dict)) if(grepl("\\.(rnw|snw)$",path, ignore.case = TRUE)) return(spell_check_file_knitr(path = path, format = "latex", dict = dict)) if(grepl("\\.(tex)$",path, ignore.case = TRUE)) return(spell_check_file_plain(path = path, format = "latex", dict = dict)) if(grepl("\\.(html?)$", path, ignore.case = TRUE)){ try({ path <- pre_filter_html(path) }) return(spell_check_file_plain(path = path, format = "html", dict = dict)) } if(grepl("\\.(xml)$",path, ignore.case = TRUE)) return(spell_check_file_plain(path = path, format = "xml", dict = dict)) if(grepl("\\.(pdf)$",path, ignore.case = TRUE)) 
return(spell_check_file_pdf(path = path, format = "text", dict = dict)) return(spell_check_file_plain(path = path, format = "text", dict = dict)) } #' @rdname spell_check_files #' @export #' @param text character vector with plain text spell_check_text <- function(text, ignore = character(), lang = "en_US"){ stopifnot(is.character(ignore)) lang <- normalize_lang(lang) dict <- hunspell::dictionary(lang, add_words = ignore) bad_words <- hunspell::hunspell(text, dict = dict) words <- sort(unique(unlist(bad_words))) out <- data.frame(word = words, stringsAsFactors = FALSE) out$found <- lapply(words, function(word) { which(vapply(bad_words, `%in%`, x = word, logical(1))) }) out } spell_check_plain <- function(text, dict){ bad_words <- hunspell::hunspell(text, dict = dict) vapply(sort(unique(unlist(bad_words))), function(word) { line_numbers <- which(vapply(bad_words, `%in%`, x = word, logical(1))) paste(line_numbers, collapse = ",") }, character(1)) } spell_check_file_text <- function(file, dict){ spell_check_plain(readLines(file), dict = dict) } spell_check_description_text <- function(file, dict){ lines <- readLines(file) lines <- gsub("", "", lines) spell_check_plain(lines, dict = dict) } spell_check_file_rd <- function(rdfile, macros = NULL, dict) { text <- if (!length(macros)) { tools::RdTextFilter(rdfile) } else { tools::RdTextFilter(rdfile, macros = macros) } Encoding(text) <- "UTF-8" spell_check_plain(text, dict = dict) } spell_check_file_md <- function(path, dict){ words <- parse_text_md(path) # Filter out citation keys, see https://github.com/ropensci/spelling/issues/9 words$text <- gsub("\\S*@\\S+", "", words$text, perl = TRUE) words$startline <- vapply(strsplit(words$position, ":", fixed = TRUE), `[[`, character(1), 1) bad_words <- hunspell::hunspell(words$text, dict = dict) vapply(sort(unique(unlist(bad_words))), function(word) { line_numbers <- which(vapply(bad_words, `%in%`, x = word, logical(1))) paste(words$startline[line_numbers], collapse = ",") }, 
character(1)) } spell_check_file_knitr <- function(path, format, dict){ latex <- remove_chunks(path) words <- hunspell::hunspell_parse(latex, format = format, dict = dict) text <- vapply(words, paste, character(1), collapse = " ") spell_check_plain(text, dict = dict) } spell_check_file_plain <- function(path, format, dict){ lines <- readLines(path, warn = FALSE, encoding = 'UTF-8') words <- hunspell::hunspell_parse(lines, format = format, dict = dict) text <- vapply(words, paste, character(1), collapse = " ") spell_check_plain(text, dict = dict) } spell_check_file_pdf <- function(path, format, dict){ lines <- pdftools::pdf_text(path) words <- hunspell::hunspell_parse(lines, format = format, dict = dict) text <- vapply(words, paste, character(1), collapse = " ") spell_check_plain(text, dict = dict) } # TODO: this does not retain whitespace in DTD before the tag pre_filter_html <- function(path){ doc <- xml2::read_html(path, options = c("RECOVER", "NOERROR")) src_nodes <- xml2::xml_find_all(doc, ".//*[@src]") xml2::xml_set_attr(src_nodes, 'src', replace_text(xml2::xml_attr(src_nodes, 'src'))) script_nodes <- xml2::xml_find_all(doc, "(.//script|.//style)") xml2::xml_set_text(script_nodes, replace_text(xml2::xml_text(script_nodes))) tmp <- file.path(tempdir(), basename(path)) unlink(tmp) xml2::write_html(doc, tmp, options = 'format_whitespace') return(tmp) } # This replaces all text except for linebreaks. # Therefore line numbers in spelling output should be unaffected replace_text <- function(x){ gsub(".*", "", x, perl = TRUE) } spelling/R/parse-markdown.R0000644000176200001440000000316713640214256015357 0ustar liggesusers#' Text Parsers #' #' Parse text from various formats and return a data frame with text lines #' and position in the source document. #' #' @noRd #' @name parse_text #' @param path markdown file #' @param yaml_fields character vector indicating which fields of the yaml #' front matter should be spell checked. 
#' @param extensions render markdown extensions? Passed to [commonmark][commonmark::markdown_xml] parse_text_md <- function(path, extensions = TRUE, yaml_fields = c("title", "subtitle", "description")){ # Read file and remove yaml front matter text <- readLines(path, warn = FALSE, encoding = 'UTF-8') parts <- partition_yaml_front_matter(text) if(length(parts$front_matter)){ yaml_fields <- paste(yaml_fields, collapse = "|") has_field <- grepl(paste0("^\\s*(",yaml_fields, ")"), parts$front_matter, ignore.case = TRUE) text[which(!has_field)] <- "" } # Get markdown AST as xml doc md <- commonmark::markdown_xml(text, sourcepos = TRUE, extensions = extensions) doc <- xml2::xml_ns_strip(xml2::read_xml(md)) # Find text nodes and their location in the markdown source doc sourcepos_nodes <- xml2::xml_find_all(doc, "//*[@sourcepos][text]") sourcepos <- xml2::xml_attr(sourcepos_nodes, "sourcepos") values <- vapply(sourcepos_nodes, function(x) { paste0(collapse = "\n", xml2::xml_text(xml2::xml_find_all(x, "./text"))) }, character(1)) # Strip 'heading identifiers', see: https://pandoc.org/MANUAL.html#heading-identifiers values <- gsub('\\{#[^\\n]+\\}\\s*($|\\r?\\n)', '\\1', values, perl = TRUE) data.frame( text = values, position = sourcepos, stringsAsFactors = FALSE ) } spelling/R/wordlist.R0000644000176200001440000000512713742570571014302 0ustar liggesusers#' The WORDLIST file #' #' The package wordlist file is used to allow custom words which will be added to the #' dictionary when spell checking. It is stored in `inst/WORDLIST` in the source package #' and must contain one word per line in UTF-8 encoded text. #' #' The [update_wordlist] function runs a full spell check on a package, shows the results, #' and then prompts to add the found words to the package wordlist. Obviously you should #' check closely that these are legitimate words and not actual spelling errors. 
It also #' removes words from the wordlist that no longer appear as spelling errors, either because #' they have been removed from the documentation or added to the `lang` dictionary. #' #' @rdname wordlist #' @name wordlist #' @family spelling #' @export #' @param confirm show changes and ask confirmation before adding new words to the list #' @inheritParams spell_check_package update_wordlist <- function(pkg = ".", vignettes = TRUE, confirm = TRUE){ pkg <- as_package(pkg) wordfile <- get_wordfile(pkg$path) old_words <- sort(get_wordlist(pkg$path), method = "radix") new_words <- sort(spell_check_package(pkg$path, vignettes = vignettes, use_wordlist = FALSE)$word, method = "radix") if(isTRUE(all.equal(old_words, new_words))){ cat(sprintf("No changes required to %s\n", wordfile)) } else { words_added <- new_words[is.na(match(new_words, old_words))] words_removed <- old_words[is.na(match(old_words, new_words))] if(length(words_added)){ cat(sprintf("The following words will be added to the wordlist:\n%s\n", paste(" -", words_added, collapse = "\n"))) } if(length(words_removed)){ cat(sprintf("The following words will be removed from the wordlist:\n%s\n", paste(" -", words_removed, collapse = "\n"))) } if(isTRUE(confirm) && length(words_added)){ cat("Are you sure you want to update the wordlist?") if (utils::menu(c("Yes", "No")) != 1){ return(invisible()) } } # Save as UTF-8 dir.create(dirname(wordfile), showWarnings = FALSE) writeLines(enc2utf8(new_words), wordfile, useBytes = TRUE) cat(sprintf("Added %d and removed %d words in %s\n", length(words_added), length(words_removed), wordfile)) } } #' @rdname wordlist #' @export get_wordlist <- function(pkg = "."){ pkg <- as_package(pkg) wordfile <- get_wordfile(pkg$path) out <- if(file.exists(wordfile)) unlist(strsplit(readLines(wordfile, warn = FALSE, encoding = "UTF-8"), " ", fixed = TRUE)) as.character(out) } get_wordfile <- function(path){ normalizePath(file.path(path, "inst/WORDLIST"), mustWork = FALSE) } 
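The update_wordlist implementation above writes the wordlist with sort(..., method = "radix"), which (per the 2.2 NEWS entry) keeps ordering locale-independent, so the file diffs the same way on every contributor's machine. A minimal standalone sketch of the difference, using invented example words:

```r
# Radix sorting compares raw bytes, so the result is identical in every locale.
# The example vector below is invented for illustration.
words <- c("b", "A", "a", "B")

# The default sort uses locale-aware collation (LC_COLLATE), which may
# interleave upper and lower case differently from machine to machine.
# Byte-wise radix order never varies: uppercase ASCII sorts before lowercase.
sort(words, method = "radix")
# "A" "B" "a" "b"
```

Because the radix result is reproducible, committing a wordlist sorted this way avoids spurious reorder-only diffs in version control.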
spelling/R/spell-check.R0000644000176200001440000002153613640214256014617 0ustar liggesusers#' Package Spell Checking #' #' Automatically spell-check package description, documentation, and vignettes. #' #' Parses and checks R manual pages, rmd/rnw vignettes, and text fields in the #' package `DESCRIPTION` file. #' #' The preferred spelling language (typically `en-GB` or `en-US`) should be specified #' in the `Language` field from your package `DESCRIPTION`. To whitelist custom words #' use the package [WORDLIST][get_wordlist] file which will be added to the dictionary #' when spell checking. See [update_wordlist] to automatically populate and update this #' file. #' #' The [spell_check_setup] function adds a unit test to your package which automatically #' runs a spell check on documentation and vignettes during `R CMD check` if the environment #' variable `NOT_CRAN` is set to `TRUE`. By default this unit test never fails; it merely #' prints potential spelling errors to the console. If not already done, #' the [spell_check_setup] function will add `spelling` as a `Suggests` dependency, #' and a `Language` field to `DESCRIPTION`. #' #' Hunspell includes dictionaries for `en_US` and `en_GB` by default. Other languages #' require installation of a custom dictionary, see [hunspell][hunspell::hunspell] for details. #' #' @export #' @rdname spell_check_package #' @name spell_check_package #' @aliases spelling #' @family spelling #' @param pkg path to package root directory containing the `DESCRIPTION` file #' @param vignettes check all `rmd` and `rnw` files in the pkg root directory (e.g. #' `readme.md`) and package `vignettes` folder. #' @param use_wordlist ignore words in the package [WORDLIST][get_wordlist] file #' @param lang set `Language` field in `DESCRIPTION` e.g. `"en-US"` or `"en-GB"`. #' For supporting other languages, see the [hunspell vignette](https://docs.ropensci.org/hunspell/articles/intro.html#hunspell-dictionaries). 
spell_check_package <- function(pkg = ".", vignettes = TRUE, use_wordlist = TRUE){ # Get package info pkg <- as_package(pkg) # Get language from DESCRIPTION lang <- normalize_lang(pkg$language) # Add custom words to the ignore list add_words <- if(isTRUE(use_wordlist)) get_wordlist(pkg$path) author <- if(length(pkg[['authors@r']])){ parse_r_field(pkg[['authors@r']]) } else { strsplit(pkg[['author']], " ", fixed = TRUE)[[1]] } ignore <- unique(c(pkg$package, author, hunspell::en_stats, add_words)) # Create the hunspell dictionary object dict <- hunspell::dictionary(lang, add_words = sort(ignore)) # Check Rd manual files rd_files <- sort(list.files(file.path(pkg$path, "man"), "\\.rd$", ignore.case = TRUE, full.names = TRUE)) macros <- tools::loadRdMacros( file.path(R.home("share"), "Rd", "macros", "system.Rd"), tools::loadPkgRdMacros(pkg$path) ) rd_lines <- lapply(rd_files, spell_check_file_rd, dict = dict, macros = macros) # Check 'DESCRIPTION' fields pkg_fields <- c("title", "description") pkg_lines <- lapply(pkg_fields, function(x){ spell_check_description_text(textConnection(pkg[[x]]), dict = dict) }) # Combine all_sources <- c(rd_files, pkg_fields) all_lines <- c(rd_lines, pkg_lines) if(isTRUE(vignettes)){ # Where to check for rmd/md files vign_files <- list.files(file.path(pkg$path, "vignettes"), pattern = "\\.r?md$", ignore.case = TRUE, full.names = TRUE, recursive = TRUE) root_files <- list.files(pkg$path, pattern = "(readme|news|changes|index).r?md", ignore.case = TRUE, full.names = TRUE) # Markdown vignettes md_files <- normalizePath(c(root_files, vign_files)) md_lines <- lapply(sort(md_files), spell_check_file_md, dict = dict) # Sweave vignettes rnw_files <- list.files(file.path(pkg$path, "vignettes"), pattern = "\\.[rs]nw$", ignore.case = TRUE, full.names = TRUE) rnw_lines <- lapply(sort(rnw_files), spell_check_file_knitr, format = "latex", dict = dict) # Combine all_sources <- c(all_sources, md_files, rnw_files) all_lines <- c(all_lines, md_lines, 
rnw_lines) } summarize_words(all_sources, all_lines) } as_package <- function(pkg){ if(inherits(pkg, 'package')) return(pkg) path <- pkg description <- if(file.exists(file.path(path, "DESCRIPTION.in"))){ file.path(path, "DESCRIPTION.in") } else { normalizePath(file.path(path, "DESCRIPTION"), mustWork = TRUE) } pkg <- read.dcf(description)[1,] Encoding(pkg) = "UTF-8" pkg <- as.list(pkg) names(pkg) <- tolower(names(pkg)) pkg$path <- dirname(description) structure(pkg, class = 'package') } # Find all occurrences of each word summarize_words <- function(file_names, found_line){ words_by_file <- lapply(found_line, names) bad_words <- sort(unique(unlist(words_by_file))) out <- data.frame( word = bad_words, stringsAsFactors = FALSE ) out$found <- lapply(bad_words, function(word) { index <- which(vapply(words_by_file, `%in%`, x = word, logical(1))) reports <- vapply(index, function(i){ paste0(basename(file_names[i]), ":", found_line[[i]][word]) }, character(1)) }) structure(out, class = c("summary_spellcheck", "data.frame")) } #' @export print.summary_spellcheck <- function(x, ...){ if(!nrow(x)){ cat("No spelling errors found.\n") return(invisible()) } words <- x$word fmt <- paste0("%-", max(nchar(words), 0) + 3, "s") pretty_names <- sprintf(fmt, words) cat(sprintf(fmt, " WORD"), " FOUND IN\n", sep = "") for(i in seq_len(nrow(x))){ cat(pretty_names[i]) cat(paste(x$found[[i]], collapse = paste0("\n", sprintf(fmt, "")))) cat("\n") } invisible(x) } #' @export #' @aliases spell_check_test #' @rdname spell_check_package #' @param error should \verb{CMD check} fail if spelling errors are found? #' Default only prints results. 
spell_check_setup <- function(pkg = ".", vignettes = TRUE, lang = "en-US", error = FALSE){ # Get package info pkg <- as_package(pkg) lang <- normalize_lang(lang) pkg$language <- lang update_description(pkg, lang = lang) update_wordlist(pkg, vignettes = vignettes) dir.create(file.path(pkg$path, "tests"), showWarnings = FALSE) writeLines(sprintf("if(requireNamespace('spelling', quietly = TRUE)) spelling::spell_check_test(vignettes = %s, error = %s, skip_on_cran = TRUE)", deparse(vignettes), deparse(error)), file.path(pkg$path, "tests/spelling.R")) cat(sprintf("Updated %s\n", file.path(pkg$path, "tests/spelling.R"))) } #' @export spell_check_test <- function(vignettes = TRUE, error = FALSE, lang = NULL, skip_on_cran = TRUE){ if(isTRUE(skip_on_cran)){ not_cran <- Sys.getenv('NOT_CRAN') # See logic in tools:::config_val_to_logical if(is.na(match(tolower(not_cran), c("1", "yes", "true")))) return(NULL) } out_save <- readLines(system.file("templates/spelling.Rout.save", package = 'spelling')) code <- format_syntax(readLines("spelling.R")) out_save <- sub("@INPUT@", code, out_save, fixed = TRUE) writeLines(out_save, "spelling.Rout.save") # Try to find pkg source directory pkg_dir <- list.files("../00_pkg_src", full.names = TRUE) if(!length(pkg_dir)){ # This is where it is on e.g. win builder check_dir <- dirname(getwd()) if(grepl("\\.Rcheck$", check_dir)){ source_dir <- sub("\\.Rcheck$", "", check_dir) if(file.exists(source_dir)) pkg_dir <- source_dir } } if(!length(pkg_dir)){ warning("Failed to find package source directory") return(invisible()) } results <- spell_check_package(pkg_dir, vignettes = vignettes) if(nrow(results)){ if(isTRUE(error)){ output <- sprintf("Potential spelling errors: %s\n", paste(results$word, collapse = ", ")) stop(output, call. 
= FALSE) } else { cat("Potential spelling errors:\n") print(results) } } cat("All Done!\n") } update_description <- function(pkg, lang = NULL){ desc <- normalizePath(file.path(pkg$path, "DESCRIPTION"), mustWork = TRUE) lines <- readLines(desc, warn = FALSE) if(!any(grepl("spelling", c(pkg$package, pkg$suggests, pkg$imports, pkg$depends)))){ lines <- if(!any(grepl("^Suggests", lines))){ c(lines, "Suggests:\n    spelling") } else { sub("^Suggests:", "Suggests:\n    spelling,", lines) } } is_lang <- grepl("^Language:", lines, ignore.case = TRUE) isolang <- gsub("_", "-", lang, fixed = TRUE) if(any(is_lang)){ is_lang <- which(grepl("^Language:", lines)) lines[is_lang] <- paste("Language:", isolang) } else { message(sprintf("Adding 'Language: %s' to DESCRIPTION", isolang)) lines <- c(lines, paste("Language:", isolang)) } writeLines(lines, desc) } format_syntax <- function(txt){ pt <- getOption('prompt') ct <- getOption('continue') prefix <- c(pt, rep(ct, length(txt) - 1)) paste(prefix, txt, collapse = "\n", sep = "") } parse_r_field <- function(txt){ tryCatch({ info <- eval(parse(text = txt)) unlist(info, recursive = TRUE, use.names = FALSE) }, error = function(e){ NULL }) } spelling/R/language.R0000644000176200001440000000150413640214256014201 0ustar liggesusers# Very simple right now: # Convert dashes to underscores # Convert 'en' to 'en_US' # Convert e.g. 'de' to 'de_DE' normalize_lang <- function(lang = NULL){ if(!length(lang) || !nchar(lang)){ message("DESCRIPTION does not contain 'Language' field. Defaulting to 'en-US'.") lang <- "en-US" } if(tolower(lang) == "en" || tolower(lang) == "eng"){ message("Found ambiguous language 'en'. Defaulting to 'en-US'") lang <- "en-US" } if(nchar(lang) == 2){ oldlang <- lang lang <- paste(tolower(lang), toupper(lang), sep = "_") message(sprintf("Found ambiguous language '%s'. 
Defaulting to '%s'", oldlang, lang)) } lang <- gsub("-", "_", lang, fixed = TRUE) parts <- strsplit(lang, "_", fixed = TRUE)[[1]] parts[1] <- tolower(parts[1]) parts[-1] <- toupper(parts[-1]) paste(parts, collapse = "_") } spelling/R/remove-chunks.R0000644000176200001440000000207113640214256015204 0ustar liggesusers# Adapted from lintr:::extract_r_source remove_chunks <- function(path) { path <- normalizePath(path, mustWork = TRUE) filename <- basename(path) lines <- readLines(path, encoding = 'UTF-8') pattern <- get_knitr_pattern(filename, lines) if (is.null(pattern$chunk.begin) || is.null(pattern$chunk.end)) { return(lines) } starts <- grep(pattern$chunk.begin, lines, perl = TRUE) ends <- grep(pattern$chunk.end, lines, perl = TRUE) # no chunks found, so just return the lines if (length(starts) == 0 || length(ends) == 0) { return(lines) } # Find first ending after a start seqs <- lapply(starts, function(start){ end <- sort(ends[ends > start])[1] if(!is.na(end)) seq(start, end) }) lines[unlist(seqs)] = "" return(lines) } detect_pattern <- function(...){ utils::getFromNamespace('detect_pattern', 'knitr')(...) 
} get_knitr_pattern <- function(filename, lines) { pattern <- detect_pattern(lines, tolower(tools::file_ext(filename))) if (!is.null(pattern)) { knitr::all_patterns[[pattern]] } else { NULL } } spelling/MD50000644000176200001440000000152713743135342012411 0ustar liggesusersbf01b4034ba41686c1f79d3dcacbc73f *DESCRIPTION 1ee0683cce6d3479250337954c075d63 *LICENSE 151c78c6510bc95a7ee0dda1c7324701 *NAMESPACE 4d1a412ac430720d39d8ffc2a47399c8 *NEWS ea5c60448d6ba78b3c44133db8e3cb15 *R/check-files.R fec788201862967ee85f4fa838596c41 *R/language.R c8c961c8e5cb90ffc1c6cc7a3e1624af *R/parse-markdown.R 84ab4c59719bd93da028ae5b0d6e7868 *R/remove-chunks.R 7169db7d9fded57a654275109db0d3f6 *R/rmarkdown.R 8529536e46e8f4890db645ce7cd82847 *R/spell-check.R a27c86dc3e3f859f104517ae03a21688 *R/wordlist.R 80f21503dddc391bb76ee403b82031a2 *inst/WORDLIST 221eb7751a0c8a4ef7b3ec62a43b6f0c *inst/templates/spelling.Rout.save 9b5fe4c2f2319f4b6a693fd6f61be8c8 *man/spell_check_files.Rd 34c646f68b45044df800a7a478859de0 *man/spell_check_package.Rd 0126a5695fde8c182806eb06dc01ace7 *man/wordlist.Rd bc882e5235bfdccc9e49f99290365b40 *tests/spelling.R spelling/inst/0000755000176200001440000000000013742570571013057 5ustar liggesusersspelling/inst/templates/0000755000176200001440000000000013640214256015045 5ustar liggesusersspelling/inst/templates/spelling.Rout.save0000644000176200001440000000131413640214256020471 0ustar liggesusers R version 3.4.1 (2017-06-30) -- "Single Candle" Copyright (C) 2017 The R Foundation for Statistical Computing Platform: x86_64-apple-darwin15.6.0 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. 
Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. @INPUT@ All Done! > > proc.time() user system elapsed 0.372 0.039 0.408 spelling/inst/WORDLIST0000644000176200001440000000006313742570571014250 0ustar liggesusersAppVeyor CMD RStudio devtools hunspell pkg rmd rnw
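The normalize_lang helper in R/language.R above maps language tags to hunspell's lower_UPPER dictionary naming, e.g. "en-GB" becomes "en_GB". The sketch below re-implements just that case-normalization step for illustration; it omits the real function's defaulting of bare codes like "en", and normalize_lang_sketch is an invented name, not the package's API:

```r
# Illustrative sketch of the lower_UPPER normalization convention from
# R/language.R. Not the package's actual function: the defaulting logic
# for ambiguous inputs such as "en" or two-letter codes is left out.
normalize_lang_sketch <- function(lang) {
  lang <- gsub("-", "_", lang, fixed = TRUE)   # "en-GB" -> "en_GB"
  parts <- strsplit(lang, "_", fixed = TRUE)[[1]]
  parts[1] <- tolower(parts[1])                # language code in lowercase
  parts[-1] <- toupper(parts[-1])              # region code(s) in uppercase
  paste(parts, collapse = "_")
}

normalize_lang_sketch("en-GB")  # "en_GB"
normalize_lang_sketch("de_de")  # "de_DE"
```

This naming matters because hunspell locates its dictionary files by exactly this convention (e.g. en_GB.dic), so a tag written as "en-gb" in DESCRIPTION still resolves to the right dictionary after normalization.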