route-recognizer-0.3.1/.cargo_vcs_info.json
{
"git": {
"sha1": "14b94114a83a4e0d4398bce2e77dc501f0f7ab1a"
},
"path_in_vcs": ""
}
route-recognizer-0.3.1/.github/CODE_OF_CONDUCT
# Contributor Covenant Code of Conduct
## Our Pledge
In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, gender identity and expression, level of experience,
education, socio-economic status, nationality, personal appearance, race,
religion, or sexual identity and orientation.
## Our Standards
Examples of behavior that contributes to creating a positive environment
include:
- Using welcoming and inclusive language
- Being respectful of differing viewpoints and experiences
- Gracefully accepting constructive criticism
- Focusing on what is best for the community
- Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
- The use of sexualized language or imagery and unwelcome sexual attention or
advances
- Trolling, insulting/derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or electronic
address, without explicit permission
- Other conduct which could reasonably be considered inappropriate in a
professional setting
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
## Scope
This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at yoshuawuyts@gmail.com, or through
IRC. All complaints will be reviewed and investigated and will result in a
response that is deemed necessary and appropriate to the circumstances. The
project team is obligated to maintain confidentiality with regard to the
reporter of an incident.
Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the Contributor Covenant, version 1.4,
available at
https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
route-recognizer-0.3.1/.github/CONTRIBUTING.md
# Contributing
Contributions include code, documentation, answering user questions, running the
project's infrastructure, and advocating for all types of users.
The project welcomes all contributions from anyone willing to work in good faith
with other contributors and the community. No contribution is too small and all
contributions are valued.
This guide explains the process for contributing to the project's GitHub
Repository.
- [Code of Conduct](#code-of-conduct)
- [Bad Actors](#bad-actors)
## Code of Conduct
The project has a [Code of Conduct](./CODE_OF_CONDUCT.md) that *all*
contributors are expected to follow. This code describes the *minimum* behavior
expectations for all contributors.
As a contributor, how you choose to act and interact towards your
fellow contributors, as well as to the community, will reflect back not only
on yourself but on the project as a whole. The Code of Conduct is designed and
intended, above all else, to help establish a culture within the project that
allows anyone and everyone who wants to contribute to feel safe doing so.
Should any individual act in any way that is considered in violation of the
[Code of Conduct](./CODE_OF_CONDUCT.md), corrective actions will be taken. It is
possible, however, for any individual to *act* in such a manner that is not in
violation of the strict letter of the Code of Conduct guidelines while still
going completely against the spirit of what that Code is intended to accomplish.
Open, diverse, and inclusive communities live and die on the basis of trust.
Contributors can disagree with one another so long as they trust that those
disagreements are in good faith and everyone is working towards a common
goal.
## Bad Actors
All contributors tacitly agree to abide by both the letter and
spirit of the [Code of Conduct](./CODE_OF_CONDUCT.md). Failure, or
unwillingness, to do so will result in contributions being respectfully
declined.
A *bad actor* is someone who repeatedly violates the *spirit* of the Code of
Conduct through consistent failure to self-regulate the way in which they
interact with other contributors in the project. In doing so, bad actors
alienate other contributors, discourage collaboration, and generally reflect
poorly on the project as a whole.
Being a bad actor may be intentional or unintentional. Typically, unintentional
bad behavior can be easily corrected by being quick to apologize and correct
course *even if you are not entirely convinced you need to*. Giving other
contributors the benefit of the doubt and having a sincere willingness to admit
that you *might* be wrong is critical for any successful open collaboration.
Don't be a bad actor.
route-recognizer-0.3.1/.github/workflows/ci.yaml
name: CI
on:
  pull_request:
  push:
    branches:
      - main
      - staging
      - trying
env:
  RUSTFLAGS: -Dwarnings
jobs:
  build_and_test:
    name: Build and test
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macOS-latest]
        rust: [stable]
    steps:
      - uses: actions/checkout@master
      - name: Install ${{ matrix.rust }}
        uses: actions-rs/toolchain@v1
        with:
          toolchain: ${{ matrix.rust }}
          override: true
      - name: check
        uses: actions-rs/cargo@v1
        with:
          command: check
          args: --all --bins --examples
      - name: tests
        uses: actions-rs/cargo@v1
        with:
          command: test
          args: --all
  check_fmt_and_docs:
    name: Checking fmt and docs
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: nightly
          components: rustfmt, clippy
          override: true
      - name: fmt
        run: cargo fmt --all -- --check
      - name: Docs
        run: cargo doc
route-recognizer-0.3.1/.gitignore
/target
/Cargo.lock
route-recognizer-0.3.1/Cargo.toml
# THIS FILE IS AUTOMATICALLY GENERATED BY CARGO
#
# When uploading crates to the registry Cargo will automatically
# "normalize" Cargo.toml files for maximal compatibility
# with all versions of Cargo and also rewrite `path` dependencies
# to registry (e.g., crates.io) dependencies.
#
# If you are reading this file be aware that the original Cargo.toml
# will likely look very different (and much more reasonable).
# See Cargo.toml.orig for the original contents.
[package]
edition = "2018"
name = "route-recognizer"
version = "0.3.1"
authors = ["wycats", "rustasync"]
description = "Recognizes URL patterns with support for dynamic and wildcard segments"
keywords = ["router", "url"]
license = "MIT"
repository = "https://github.com/rustasync/route-recognizer"
route-recognizer-0.3.1/Cargo.toml.orig
[package]
name = "route-recognizer"
description = "Recognizes URL patterns with support for dynamic and wildcard segments"
license = "MIT"
repository = "https://github.com/rustasync/route-recognizer"
keywords = ["router", "url"]
edition = "2018"
version = "0.3.1"
authors = ["wycats", "rustasync"]
route-recognizer-0.3.1/LICENSE-MIT
The MIT License (MIT)
Copyright (c) 2020 The http-rs contributors
Copyright (c) 2019 The rustasync contributors
Copyright (c) 2014 Yehuda Katz
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
route-recognizer-0.3.1/README.md
# route-recognizer
Recognizes URL patterns with support for dynamic and wildcard segments
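According to the crate documentation, a route is built from four kinds of segments: static segments (`/a/b`), params (`/a/:b`), named wildcards (`/a/*b`) and unnamed wildcards (`/a/*`). The std-only sketch below illustrates that classification by looking at the first character of each segment. It is not the crate's internal code: `SegmentKind` and `classify` are hypothetical names introduced here for illustration, and the sketch splits on `/` only, whereas the crate also treats `.` as a separator.

```rust
/// Kinds of route segment, named after the crate docs (hypothetical type).
#[derive(Debug, PartialEq)]
enum SegmentKind<'a> {
    /// A literal segment such as `posts`.
    Static(&'a str),
    /// A `:name` segment; the name is empty for a bare `:`.
    Param(&'a str),
    /// A `*name` segment (`Some(name)`) or a bare `*` (`None`).
    Wildcard(Option<&'a str>),
}

/// Classify each `/`-separated segment of a route pattern by its first character.
fn classify(route: &str) -> Vec<SegmentKind<'_>> {
    // Drop a single leading slash, if present.
    let route = route.strip_prefix('/').unwrap_or(route);
    route
        .split('/')
        .map(|segment| {
            if let Some(name) = segment.strip_prefix(':') {
                SegmentKind::Param(name)
            } else if let Some(name) = segment.strip_prefix('*') {
                SegmentKind::Wildcard(if name.is_empty() { None } else { Some(name) })
            } else {
                SegmentKind::Static(segment)
            }
        })
        .collect()
}

fn main() {
    for route in ["/posts/:post_id", "/assets/*path", "/files/*"] {
        println!("{} -> {:?}", route, classify(route));
    }
}
```

In the crate itself, named params and named wildcards are the ones that show up in `Params` after a successful match; unnamed ones are matched but not recorded.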
## Installation
```sh
$ cargo add route-recognizer
```
## Safety
This crate uses ``#![deny(unsafe_code)]`` to ensure everything is implemented in
100% Safe Rust.
## Contributing
Want to join us? Check out our ["Contributing" guide][contributing] and take a
look at some of these issues:
- [Issues labeled "good first issue"][good-first-issue]
- [Issues labeled "help wanted"][help-wanted]
[contributing]: https://github.com/http-rs/route-recognizer/blob/master/.github/CONTRIBUTING.md
[good-first-issue]: https://github.com/http-rs/route-recognizer/labels/good%20first%20issue
[help-wanted]: https://github.com/http-rs/route-recognizer/labels/help%20wanted
## License
Licensed under the MIT license.
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this crate by you, as defined in the Apache-2.0 license, shall
be dual licensed as MIT / Apache-2.0, without any additional terms or
conditions.
route-recognizer-0.3.1/benches/bench.rs
#![feature(test)]
extern crate route_recognizer;
extern crate test;
use route_recognizer::Router;
#[bench]
fn benchmark(b: &mut test::Bencher) {
let mut router = Router::new();
router.add("/posts/:post_id/comments/:id", "comment".to_string());
router.add("/posts/:post_id/comments", "comments".to_string());
router.add("/posts/:post_id", "post".to_string());
router.add("/posts", "posts".to_string());
router.add("/comments", "comments2".to_string());
router.add("/comments/:id", "comment2".to_string());
b.iter(|| router.recognize("/posts/100/comments/200"));
}
route-recognizer-0.3.1/benches/nfa.rs
#![feature(test)]
extern crate route_recognizer;
extern crate test;
use route_recognizer::nfa::CharSet;
use std::collections::{BTreeSet, HashSet};
#[bench]
fn bench_char_set(b: &mut test::Bencher) {
let mut set = CharSet::new();
set.insert('p');
set.insert('n');
set.insert('/');
b.iter(|| {
assert!(set.contains('p'));
assert!(set.contains('/'));
assert!(!set.contains('z'));
});
}
#[bench]
fn bench_hash_set(b: &mut test::Bencher) {
let mut set = HashSet::new();
set.insert('p');
set.insert('n');
set.insert('/');
b.iter(|| {
assert!(set.contains(&'p'));
assert!(set.contains(&'/'));
assert!(!set.contains(&'z'));
});
}
#[bench]
fn bench_btree_set(b: &mut test::Bencher) {
let mut set = BTreeSet::new();
set.insert('p');
set.insert('n');
set.insert('/');
b.iter(|| {
assert!(set.contains(&'p'));
assert!(set.contains(&'/'));
assert!(!set.contains(&'z'));
});
}
route-recognizer-0.3.1/src/lib.rs
//! Recognizes URL patterns with support for dynamic and wildcard segments
//!
//! # Examples
//!
//! ```
//! use route_recognizer::{Router, Params};
//!
//! let mut router = Router::new();
//!
//! router.add("/thomas", "Thomas".to_string());
//! router.add("/tom", "Tom".to_string());
//! router.add("/wycats", "Yehuda".to_string());
//!
//! let m = router.recognize("/thomas").unwrap();
//!
//! assert_eq!(m.handler().as_str(), "Thomas");
//! assert_eq!(m.params(), &Params::new());
//! ```
//!
//! # Routing params
//!
//! The router supports four kinds of route segments:
//! - __segments__: these are of the format `/a/b`.
//! - __params__: these are of the format `/a/:b`.
//! - __named wildcards__: these are of the format `/a/*b`.
//! - __unnamed wildcards__: these are of the format `/a/*`.
//!
//! The difference between a "named wildcard" and a "param" is how the
//! matching rules apply. Given the route `/a/:b`, passing in `/a/bar/baz`
//! will not match because `/baz` has no counterpart in the route.
//!
//! However, if we define the route `/a/*b` and pass in `/a/bar/baz`, we end up
//! with a named param `"b"` that contains the value `"bar/baz"`. Wildcard
//! routing rules are useful when you don't know which routes may follow. The
//! difference between "named" and "unnamed" wildcards is that the former will
//! show up in `Params`, while the latter won't.
#![cfg_attr(feature = "docs", feature(doc_cfg))]
#![deny(unsafe_code)]
#![deny(missing_debug_implementations, nonstandard_style)]
#![warn(missing_docs, unreachable_pub, future_incompatible, rust_2018_idioms)]
#![doc(test(attr(deny(warnings))))]
#![doc(test(attr(allow(unused_extern_crates, unused_variables))))]
#![doc(html_favicon_url = "https://yoshuawuyts.com/assets/http-rs/favicon.ico")]
#![doc(html_logo_url = "https://yoshuawuyts.com/assets/http-rs/logo-rounded.png")]
use std::cmp::Ordering;
use std::collections::{btree_map, BTreeMap};
use std::ops::Index;
use crate::nfa::{CharacterClass, NFA};
#[doc(hidden)]
pub mod nfa;
#[derive(Clone, Eq, Debug)]
struct Metadata {
statics: u32,
dynamics: u32,
wildcards: u32,
param_names: Vec<String>,
}
impl Metadata {
pub(crate) fn new() -> Self {
Self {
statics: 0,
dynamics: 0,
wildcards: 0,
param_names: Vec::new(),
}
}
}
impl Ord for Metadata {
fn cmp(&self, other: &Self) -> Ordering {
if self.statics > other.statics {
Ordering::Greater
} else if self.statics < other.statics {
Ordering::Less
} else if self.dynamics > other.dynamics {
Ordering::Greater
} else if self.dynamics < other.dynamics {
Ordering::Less
} else if self.wildcards > other.wildcards {
Ordering::Greater
} else if self.wildcards < other.wildcards {
Ordering::Less
} else {
Ordering::Equal
}
}
}
impl PartialOrd for Metadata {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
Some(self.cmp(other))
}
}
impl PartialEq for Metadata {
fn eq(&self, other: &Self) -> bool {
self.statics == other.statics
&& self.dynamics == other.dynamics
&& self.wildcards == other.wildcards
}
}
/// Router parameters.
#[derive(PartialEq, Clone, Debug, Default)]
pub struct Params {
map: BTreeMap<String, String>,
}
impl Params {
/// Create a new instance of `Params`.
pub fn new() -> Self {
Self {
map: BTreeMap::new(),
}
}
/// Insert a new param into `Params`.
pub fn insert(&mut self, key: String, value: String) {
self.map.insert(key, value);
}
/// Find a param by name in `Params`.
pub fn find(&self, key: &str) -> Option<&str> {
self.map.get(key).map(|s| &s[..])
}
/// Iterate over all named params.
///
/// This will return all named params and named wildcards.
pub fn iter(&self) -> Iter<'_> {
Iter(self.map.iter())
}
}
impl Index<&str> for Params {
type Output = String;
fn index(&self, index: &str) -> &String {
match self.map.get(index) {
None => panic!("params[{}] did not exist", index),
Some(s) => s,
}
}
}
impl<'a> IntoIterator for &'a Params {
type IntoIter = Iter<'a>;
type Item = (&'a str, &'a str);
fn into_iter(self) -> Iter<'a> {
self.iter()
}
}
/// An iterator over `Params`.
#[derive(Debug)]
pub struct Iter<'a>(btree_map::Iter<'a, String, String>);
impl<'a> Iterator for Iter<'a> {
type Item = (&'a str, &'a str);
#[inline]
fn next(&mut self) -> Option<(&'a str, &'a str)> {
self.0.next().map(|(k, v)| (&**k, &**v))
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.0.size_hint()
}
}
/// The result of a successful match returned by `Router::recognize`.
#[derive(Debug)]
pub struct Match<T> {
/// Return the endpoint handler.
handler: T,
/// Return the params.
params: Params,
}
impl<T> Match<T> {
/// Create a new instance of `Match`.
pub fn new(handler: T, params: Params) -> Self {
Self { handler, params }
}
/// Get a handle to the handler.
pub fn handler(&self) -> &T {
&self.handler
}
/// Get a mutable handle to the handler.
pub fn handler_mut(&mut self) -> &mut T {
&mut self.handler
}
/// Get a handle to the params.
pub fn params(&self) -> &Params {
&self.params
}
/// Get a mutable handle to the params.
pub fn params_mut(&mut self) -> &mut Params {
&mut self.params
}
}
/// Recognizes URL patterns with support for dynamic and wildcard segments.
#[derive(Clone, Debug)]
pub struct Router<T> {
nfa: NFA<Metadata>,
handlers: BTreeMap<usize, T>,
}
fn segments(route: &str) -> Vec<(Option<char>, &str)> {
let predicate = |c| c == '.' || c == '/';
let mut segments = vec![];
let mut segment_start = 0;
while segment_start < route.len() {
let segment_end = route[segment_start + 1..]
.find(predicate)
.map(|i| i + segment_start + 1)
.unwrap_or_else(|| route.len());
let potential_sep = route.chars().nth(segment_start);
let sep_and_segment = match potential_sep {
Some(sep) if predicate(sep) => (Some(sep), &route[segment_start + 1..segment_end]),
_ => (None, &route[segment_start..segment_end]),
};
segments.push(sep_and_segment);
segment_start = segment_end;
}
segments
}
impl<T> Router<T> {
/// Create a new instance of `Router`.
pub fn new() -> Self {
Self {
nfa: NFA::new(),
handlers: BTreeMap::new(),
}
}
/// Add a route to the router.
pub fn add(&mut self, mut route: &str, dest: T) {
if !route.is_empty() && route.as_bytes()[0] == b'/' {
route = &route[1..];
}
let nfa = &mut self.nfa;
let mut state = 0;
let mut metadata = Metadata::new();
for (separator, segment) in segments(route) {
if let Some(separator) = separator {
state = nfa.put(state, CharacterClass::valid_char(separator));
}
if !segment.is_empty() && segment.as_bytes()[0] == b':' {
state = process_dynamic_segment(nfa, state);
metadata.dynamics += 1;
metadata.param_names.push(segment[1..].to_string());
} else if !segment.is_empty() && segment.as_bytes()[0] == b'*' {
state = process_star_state(nfa, state);
metadata.wildcards += 1;
metadata.param_names.push(segment[1..].to_string());
} else {
state = process_static_segment(segment, nfa, state);
metadata.statics += 1;
}
}
nfa.acceptance(state);
nfa.metadata(state, metadata);
self.handlers.insert(state, dest);
}
/// Match a route on the router.
pub fn recognize(&self, mut path: &str) -> Result<Match<&T>, String> {
if !path.is_empty() && path.as_bytes()[0] == b'/' {
path = &path[1..];
}
let nfa = &self.nfa;
let result = nfa.process(path, |index| nfa.get(index).metadata.as_ref().unwrap());
match result {
Ok(nfa_match) => {
let mut map = Params::new();
let state = &nfa.get(nfa_match.state);
let metadata = state.metadata.as_ref().unwrap();
let param_names = metadata.param_names.clone();
for (i, capture) in nfa_match.captures.iter().enumerate() {
if !param_names[i].is_empty() {
map.insert(param_names[i].to_string(), capture.to_string());
}
}
let handler = self.handlers.get(&nfa_match.state).unwrap();
Ok(Match::new(handler, map))
}
Err(str) => Err(str),
}
}
}
impl<T> Default for Router<T> {
fn default() -> Self {
Self::new()
}
}
fn process_static_segment<T>(segment: &str, nfa: &mut NFA<T>, mut state: usize) -> usize {
for char in segment.chars() {
state = nfa.put(state, CharacterClass::valid_char(char));
}
state
}
fn process_dynamic_segment<T>(nfa: &mut NFA<T>, mut state: usize) -> usize {
state = nfa.put(state, CharacterClass::invalid_char('/'));
nfa.put_state(state, state);
nfa.start_capture(state);
nfa.end_capture(state);
state
}
fn process_star_state<T>(nfa: &mut NFA<T>, mut state: usize) -> usize {
state = nfa.put(state, CharacterClass::any());
nfa.put_state(state, state);
nfa.start_capture(state);
nfa.end_capture(state);
state
}
#[cfg(test)]
mod tests {
use super::{Params, Router};
#[test]
fn basic_router() {
let mut router = Router::new();
router.add("/thomas", "Thomas".to_string());
router.add("/tom", "Tom".to_string());
router.add("/wycats", "Yehuda".to_string());
let m = router.recognize("/thomas").unwrap();
assert_eq!(*m.handler, "Thomas".to_string());
assert_eq!(m.params, Params::new());
}
#[test]
fn root_router() {
let mut router = Router::new();
router.add("/", 10);
assert_eq!(*router.recognize("/").unwrap().handler, 10)
}
#[test]
fn empty_path() {
let mut router = Router::new();
router.add("/", 12);
assert_eq!(*router.recognize("").unwrap().handler, 12)
}
#[test]
fn empty_route() {
let mut router = Router::new();
router.add("", 12);
assert_eq!(*router.recognize("/").unwrap().handler, 12)
}
#[test]
fn ambiguous_router() {
let mut router = Router::new();
router.add("/posts/new", "new".to_string());
router.add("/posts/:id", "id".to_string());
let id = router.recognize("/posts/1").unwrap();
assert_eq!(*id.handler, "id".to_string());
assert_eq!(id.params, params("id", "1"));
let new = router.recognize("/posts/new").unwrap();
assert_eq!(*new.handler, "new".to_string());
assert_eq!(new.params, Params::new());
}
#[test]
fn ambiguous_router_b() {
let mut router = Router::new();
router.add("/posts/:id", "id".to_string());
router.add("/posts/new", "new".to_string());
let id = router.recognize("/posts/1").unwrap();
assert_eq!(*id.handler, "id".to_string());
assert_eq!(id.params, params("id", "1"));
let new = router.recognize("/posts/new").unwrap();
assert_eq!(*new.handler, "new".to_string());
assert_eq!(new.params, Params::new());
}
#[test]
fn multiple_params() {
let mut router = Router::new();
router.add("/posts/:post_id/comments/:id", "comment".to_string());
router.add("/posts/:post_id/comments", "comments".to_string());
let com = router.recognize("/posts/12/comments/100").unwrap();
let coms = router.recognize("/posts/12/comments").unwrap();
assert_eq!(*com.handler, "comment".to_string());
assert_eq!(com.params, two_params("post_id", "12", "id", "100"));
assert_eq!(*coms.handler, "comments".to_string());
assert_eq!(coms.params, params("post_id", "12"));
assert_eq!(coms.params["post_id"], "12".to_string());
}
#[test]
fn wildcard() {
let mut router = Router::new();
router.add("*foo", "test".to_string());
router.add("/bar/*foo", "test2".to_string());
let m = router.recognize("/test").unwrap();
assert_eq!(*m.handler, "test".to_string());
assert_eq!(m.params, params("foo", "test"));
let m = router.recognize("/foo/bar").unwrap();
assert_eq!(*m.handler, "test".to_string());
assert_eq!(m.params, params("foo", "foo/bar"));
let m = router.recognize("/bar/foo").unwrap();
assert_eq!(*m.handler, "test2".to_string());
assert_eq!(m.params, params("foo", "foo"));
}
#[test]
fn wildcard_colon() {
let mut router = Router::new();
router.add("/a/*b", "ab".to_string());
router.add("/a/*b/c", "abc".to_string());
router.add("/a/*b/c/:d", "abcd".to_string());
let m = router.recognize("/a/foo").unwrap();
assert_eq!(*m.handler, "ab".to_string());
assert_eq!(m.params, params("b", "foo"));
let m = router.recognize("/a/foo/bar").unwrap();
assert_eq!(*m.handler, "ab".to_string());
assert_eq!(m.params, params("b", "foo/bar"));
let m = router.recognize("/a/foo/c").unwrap();
assert_eq!(*m.handler, "abc".to_string());
assert_eq!(m.params, params("b", "foo"));
let m = router.recognize("/a/foo/bar/c").unwrap();
assert_eq!(*m.handler, "abc".to_string());
assert_eq!(m.params, params("b", "foo/bar"));
let m = router.recognize("/a/foo/c/baz").unwrap();
assert_eq!(*m.handler, "abcd".to_string());
assert_eq!(m.params, two_params("b", "foo", "d", "baz"));
let m = router.recognize("/a/foo/bar/c/baz").unwrap();
assert_eq!(*m.handler, "abcd".to_string());
assert_eq!(m.params, two_params("b", "foo/bar", "d", "baz"));
let m = router.recognize("/a/foo/bar/c/baz/bay").unwrap();
assert_eq!(*m.handler, "ab".to_string());
assert_eq!(m.params, params("b", "foo/bar/c/baz/bay"));
}
#[test]
fn unnamed_parameters() {
let mut router = Router::new();
router.add("/foo/:/bar", "test".to_string());
router.add("/foo/:bar/*", "test2".to_string());
let m = router.recognize("/foo/test/bar").unwrap();
assert_eq!(*m.handler, "test");
assert_eq!(m.params, Params::new());
let m = router.recognize("/foo/test/blah").unwrap();
assert_eq!(*m.handler, "test2");
assert_eq!(m.params, params("bar", "test"));
}
fn params(key: &str, val: &str) -> Params {
let mut map = Params::new();
map.insert(key.to_string(), val.to_string());
map
}
fn two_params(k1: &str, v1: &str, k2: &str, v2: &str) -> Params {
let mut map = Params::new();
map.insert(k1.to_string(), v1.to_string());
map.insert(k2.to_string(), v2.to_string());
map
}
#[test]
fn dot() {
let mut router = Router::new();
router.add("/1/baz.:wibble", ());
router.add("/2/:bar.baz", ());
router.add("/3/:dynamic.:extension", ());
router.add("/4/static.static", ());
let m = router.recognize("/1/baz.jpg").unwrap();
assert_eq!(m.params, params("wibble", "jpg"));
let m = router.recognize("/2/test.baz").unwrap();
assert_eq!(m.params, params("bar", "test"));
let m = router.recognize("/3/any.thing").unwrap();
assert_eq!(m.params, two_params("dynamic", "any", "extension", "thing"));
let m = router.recognize("/3/this.performs.a.greedy.match").unwrap();
assert_eq!(
m.params,
two_params("dynamic", "this.performs.a.greedy", "extension", "match")
);
let m = router.recognize("/4/static.static").unwrap();
assert_eq!(m.params, Params::new());
let m = router.recognize("/4/static/static");
assert!(m.is_err());
let m = router.recognize("/4.static.static");
assert!(m.is_err());
}
#[test]
fn test_chinese() {
let mut router = Router::new();
router.add("/crates/:foo/:bar", "Hello".to_string());
let m = router.recognize("/crates/实打实打算/d's'd").unwrap();
assert_eq!(m.handler().as_str(), "Hello");
assert_eq!(m.params().find("foo"), Some("实打实打算"));
assert_eq!(m.params().find("bar"), Some("d's'd"));
}
}
route-recognizer-0.3.1/src/nfa.rs
use std::collections::HashSet;
use self::CharacterClass::{Ascii, InvalidChars, ValidChars};
#[derive(PartialEq, Eq, Clone, Default, Debug)]
pub struct CharSet {
low_mask: u64,
high_mask: u64,
non_ascii: HashSet<char>,
}
impl CharSet {
pub fn new() -> Self {
Self {
low_mask: 0,
high_mask: 0,
non_ascii: HashSet::new(),
}
}
pub fn insert(&mut self, char: char) {
let val = char as u32 - 1;
if val > 127 {
self.non_ascii.insert(char);
} else if val > 63 {
let bit = 1 << (val - 64);
self.high_mask |= bit;
} else {
let bit = 1 << val;
self.low_mask |= bit;
}
}
pub fn contains(&self, char: char) -> bool {
let val = char as u32 - 1;
if val > 127 {
self.non_ascii.contains(&char)
} else if val > 63 {
let bit = 1 << (val - 64);
self.high_mask & bit != 0
} else {
let bit = 1 << val;
self.low_mask & bit != 0
}
}
}
#[derive(PartialEq, Eq, Clone, Debug)]
pub enum CharacterClass {
Ascii(u64, u64, bool),
ValidChars(CharSet),
InvalidChars(CharSet),
}
impl CharacterClass {
pub fn any() -> Self {
Ascii(u64::max_value(), u64::max_value(), true)
}
pub fn valid(string: &str) -> Self {
ValidChars(Self::str_to_set(string))
}
pub fn invalid(string: &str) -> Self {
InvalidChars(Self::str_to_set(string))
}
pub fn valid_char(char: char) -> Self {
let val = char as u32 - 1;
if val > 127 {
ValidChars(Self::char_to_set(char))
} else if val > 63 {
Ascii(1 << (val - 64), 0, false)
} else {
Ascii(0, 1 << val, false)
}
}
pub fn invalid_char(char: char) -> Self {
let val = char as u32 - 1;
if val > 127 {
InvalidChars(Self::char_to_set(char))
} else if val > 63 {
Ascii(u64::max_value() ^ (1 << (val - 64)), u64::max_value(), true)
} else {
Ascii(u64::max_value(), u64::max_value() ^ (1 << val), true)
}
}
pub fn matches(&self, char: char) -> bool {
match *self {
ValidChars(ref valid) => valid.contains(char),
InvalidChars(ref invalid) => !invalid.contains(char),
Ascii(high, low, unicode) => {
let val = char as u32 - 1;
if val > 127 {
unicode
} else if val > 63 {
high & (1 << (val - 64)) != 0
} else {
low & (1 << val) != 0
}
}
}
}
fn char_to_set(char: char) -> CharSet {
let mut set = CharSet::new();
set.insert(char);
set
}
fn str_to_set(string: &str) -> CharSet {
let mut set = CharSet::new();
for char in string.chars() {
set.insert(char);
}
set
}
}
#[derive(Clone)]
struct Thread {
state: usize,
captures: Vec<(usize, usize)>,
capture_begin: Option<usize>,
}
impl Thread {
pub(crate) fn new() -> Self {
Self {
state: 0,
captures: Vec::new(),
capture_begin: None,
}
}
#[inline]
pub(crate) fn start_capture(&mut self, start: usize) {
self.capture_begin = Some(start);
}
#[inline]
pub(crate) fn end_capture(&mut self, end: usize) {
self.captures.push((self.capture_begin.unwrap(), end));
self.capture_begin = None;
}
pub(crate) fn extract<'a>(&self, source: &'a str) -> Vec<&'a str> {
self.captures
.iter()
.map(|&(begin, end)| &source[begin..end])
.collect()
}
}
#[derive(Clone, Debug)]
pub struct State<T> {
pub index: usize,
pub chars: CharacterClass,
pub next_states: Vec<usize>,
pub acceptance: bool,
pub start_capture: bool,
pub end_capture: bool,
pub metadata: Option<T>,
}
impl<T> PartialEq for State<T> {
fn eq(&self, other: &Self) -> bool {
self.index == other.index
}
}
impl<T> State<T> {
pub fn new(index: usize, chars: CharacterClass) -> Self {
Self {
index,
chars,
next_states: Vec::new(),
acceptance: false,
start_capture: false,
end_capture: false,
metadata: None,
}
}
}
#[derive(Debug)]
pub struct Match<'a> {
pub state: usize,
pub captures: Vec<&'a str>,
}
impl<'a> Match<'a> {
pub fn new(state: usize, captures: Vec<&'_ str>) -> Match<'_> {
Match { state, captures }
}
}
#[derive(Clone, Default, Debug)]
pub struct NFA<T> {
states: Vec<State<T>>,
start_capture: Vec<bool>,
end_capture: Vec<bool>,
acceptance: Vec<bool>,
}
impl<T> NFA<T> {
pub fn new() -> Self {
let root = State::new(0, CharacterClass::any());
Self {
states: vec![root],
start_capture: vec![false],
end_capture: vec![false],
acceptance: vec![false],
}
}
pub fn process<'a, I, F>(&self, string: &'a str, mut ord: F) -> Result<Match<'a>, String>
where
I: Ord,
F: FnMut(usize) -> I,
{
let mut threads = vec![Thread::new()];
for (i, char) in string.char_indices() {
let next_threads = self.process_char(threads, char, i);
if next_threads.is_empty() {
return Err(format!("Couldn't process {}", string));
}
threads = next_threads;
}
let returned = threads
.into_iter()
.filter(|thread| self.get(thread.state).acceptance);
let thread = returned
.fold(None, |prev, y| {
let y_v = ord(y.state);
match prev {
None => Some((y_v, y)),
Some((x_v, x)) => {
if x_v < y_v {
Some((y_v, y))
} else {
Some((x_v, x))
}
}
}
})
.map(|p| p.1);
match thread {
None => Err("The string was exhausted before reaching an \
acceptance state"
.to_string()),
Some(mut thread) => {
if thread.capture_begin.is_some() {
thread.end_capture(string.len());
}
let state = self.get(thread.state);
Ok(Match::new(state.index, thread.extract(string)))
}
}
}
#[inline]
fn process_char(&self, threads: Vec<Thread>, char: char, pos: usize) -> Vec<Thread> {
let mut returned = Vec::with_capacity(threads.len());
for mut thread in threads {
let current_state = self.get(thread.state);
let mut count = 0;
let mut found_state = 0;
for &index in &current_state.next_states {
let state = &self.states[index];
if state.chars.matches(char) {
count += 1;
found_state = index;
}
}
if count == 1 {
thread.state = found_state;
capture(self, &mut thread, current_state.index, found_state, pos);
returned.push(thread);
continue;
}
for &index in &current_state.next_states {
let state = &self.states[index];
if state.chars.matches(char) {
let mut thread = fork_thread(&thread, state);
capture(self, &mut thread, current_state.index, index, pos);
returned.push(thread);
}
}
}
returned
}
#[inline]
pub fn get(&self, state: usize) -> &State<T> {
&self.states[state]
}
pub fn get_mut(&mut self, state: usize) -> &mut State<T> {
&mut self.states[state]
}
pub fn put(&mut self, index: usize, chars: CharacterClass) -> usize {
{
let state = self.get(index);
for &index in &state.next_states {
let state = self.get(index);
if state.chars == chars {
return index;
}
}
}
let state = self.new_state(chars);
self.get_mut(index).next_states.push(state);
state
}
pub fn put_state(&mut self, index: usize, child: usize) {
if !self.states[index].next_states.contains(&child) {
self.get_mut(index).next_states.push(child);
}
}
pub fn acceptance(&mut self, index: usize) {
self.get_mut(index).acceptance = true;
self.acceptance[index] = true;
}
pub fn start_capture(&mut self, index: usize) {
self.get_mut(index).start_capture = true;
self.start_capture[index] = true;
}
pub fn end_capture(&mut self, index: usize) {
self.get_mut(index).end_capture = true;
self.end_capture[index] = true;
}
pub fn metadata(&mut self, index: usize, metadata: T) {
self.get_mut(index).metadata = Some(metadata);
}
fn new_state(&mut self, chars: CharacterClass) -> usize {
let index = self.states.len();
let state = State::new(index, chars);
self.states.push(state);
self.acceptance.push(false);
self.start_capture.push(false);
self.end_capture.push(false);
index
}
}
#[inline]
fn fork_thread<T>(thread: &Thread, state: &State<T>) -> Thread {
let mut new_trace = thread.clone();
new_trace.state = state.index;
new_trace
}
#[inline]
fn capture<T>(
nfa: &NFA<T>,
thread: &mut Thread,
current_state: usize,
next_state: usize,
pos: usize,
) {
if thread.capture_begin.is_none() && nfa.start_capture[next_state] {
thread.start_capture(pos);
}
if thread.capture_begin.is_some() && nfa.end_capture[current_state] && next_state > current_state
{
thread.end_capture(pos);
}
}
#[cfg(test)]
mod tests {
use super::{CharSet, CharacterClass, NFA};
#[test]
fn basic_test() {
let mut nfa = NFA::<()>::new();
let a = nfa.put(0, CharacterClass::valid("h"));
let b = nfa.put(a, CharacterClass::valid("e"));
let c = nfa.put(b, CharacterClass::valid("l"));
let d = nfa.put(c, CharacterClass::valid("l"));
let e = nfa.put(d, CharacterClass::valid("o"));
nfa.acceptance(e);
let m = nfa.process("hello", |a| a);
assert!(
m.unwrap().state == e,
"You didn't get the right final state"
);
}
#[test]
fn multiple_solutions() {
let mut nfa = NFA::<()>::new();
let a1 = nfa.put(0, CharacterClass::valid("n"));
let b1 = nfa.put(a1, CharacterClass::valid("e"));
let c1 = nfa.put(b1, CharacterClass::valid("w"));
nfa.acceptance(c1);
let a2 = nfa.put(0, CharacterClass::invalid(""));
let b2 = nfa.put(a2, CharacterClass::invalid(""));
let c2 = nfa.put(b2, CharacterClass::invalid(""));
nfa.acceptance(c2);
let m = nfa.process("new", |a| a);
assert!(m.unwrap().state == c2, "The two states were not found");
}
#[test]
fn multiple_paths() {
let mut nfa = NFA::<()>::new();
let a = nfa.put(0, CharacterClass::valid("t")); // t
let b1 = nfa.put(a, CharacterClass::valid("h")); // th
let c1 = nfa.put(b1, CharacterClass::valid("o")); // tho
let d1 = nfa.put(c1, CharacterClass::valid("m")); // thom
let e1 = nfa.put(d1, CharacterClass::valid("a")); // thoma
let f1 = nfa.put(e1, CharacterClass::valid("s")); // thomas
let b2 = nfa.put(a, CharacterClass::valid("o")); // to
let c2 = nfa.put(b2, CharacterClass::valid("m")); // tom
nfa.acceptance(f1);
nfa.acceptance(c2);
let thomas = nfa.process("thomas", |a| a);
let tom = nfa.process("tom", |a| a);
let thom = nfa.process("thom", |a| a);
let nope = nfa.process("nope", |a| a);
assert!(thomas.unwrap().state == f1, "thomas was parsed correctly");
assert!(tom.unwrap().state == c2, "tom was parsed correctly");
assert!(thom.is_err(), "thom didn't reach an acceptance state");
assert!(nope.is_err(), "nope wasn't parsed");
}
#[test]
fn repetitions() {
let mut nfa = NFA::<()>::new();
let a = nfa.put(0, CharacterClass::valid("p")); // p
let b = nfa.put(a, CharacterClass::valid("o")); // po
let c = nfa.put(b, CharacterClass::valid("s")); // pos
let d = nfa.put(c, CharacterClass::valid("t")); // post
let e = nfa.put(d, CharacterClass::valid("s")); // posts
let f = nfa.put(e, CharacterClass::valid("/")); // posts/
let g = nfa.put(f, CharacterClass::invalid("/")); // posts/[^/]
nfa.put_state(g, g);
nfa.acceptance(g);
let post = nfa.process("posts/1", |a| a);
let new_post = nfa.process("posts/new", |a| a);
let invalid = nfa.process("posts/", |a| a);
assert!(post.unwrap().state == g, "posts/1 was parsed");
assert!(new_post.unwrap().state == g, "posts/new was parsed");
assert!(invalid.is_err(), "posts/ was invalid");
}
#[test]
fn repetitions_with_ambiguous() {
let mut nfa = NFA::<()>::new();
let a = nfa.put(0, CharacterClass::valid("p")); // p
let b = nfa.put(a, CharacterClass::valid("o")); // po
let c = nfa.put(b, CharacterClass::valid("s")); // pos
let d = nfa.put(c, CharacterClass::valid("t")); // post
let e = nfa.put(d, CharacterClass::valid("s")); // posts
let f = nfa.put(e, CharacterClass::valid("/")); // posts/
let g1 = nfa.put(f, CharacterClass::invalid("/")); // posts/[^/]
let g2 = nfa.put(f, CharacterClass::valid("n")); // posts/n
let h2 = nfa.put(g2, CharacterClass::valid("e")); // posts/ne
let i2 = nfa.put(h2, CharacterClass::valid("w")); // posts/new
nfa.put_state(g1, g1);
nfa.acceptance(g1);
nfa.acceptance(i2);
let post = nfa.process("posts/1", |a| a);
let ambiguous = nfa.process("posts/new", |a| a);
let invalid = nfa.process("posts/", |a| a);
assert!(post.unwrap().state == g1, "posts/1 was parsed");
assert!(ambiguous.unwrap().state == i2, "posts/new was ambiguous");
assert!(invalid.is_err(), "posts/ was invalid");
}
#[test]
fn captures() {
let mut nfa = NFA::<()>::new();
let a = nfa.put(0, CharacterClass::valid("n"));
let b = nfa.put(a, CharacterClass::valid("e"));
let c = nfa.put(b, CharacterClass::valid("w"));
nfa.acceptance(c);
nfa.start_capture(a);
nfa.end_capture(c);
let post = nfa.process("new", |a| a);
assert_eq!(post.unwrap().captures, vec!["new"]);
}
#[test]
fn capture_mid_match() {
let mut nfa = NFA::<()>::new();
let a = nfa.put(0, valid('p'));
let b = nfa.put(a, valid('/'));
let c = nfa.put(b, invalid('/'));
let d = nfa.put(c, valid('/'));
let e = nfa.put(d, valid('c'));
nfa.put_state(c, c);
nfa.acceptance(e);
nfa.start_capture(c);
nfa.end_capture(c);
let post = nfa.process("p/123/c", |a| a);
assert_eq!(post.unwrap().captures, vec!["123"]);
}
#[test]
fn capture_multiple_captures() {
let mut nfa = NFA::<()>::new();
let a = nfa.put(0, valid('p'));
let b = nfa.put(a, valid('/'));
let c = nfa.put(b, invalid('/'));
let d = nfa.put(c, valid('/'));
let e = nfa.put(d, valid('c'));
let f = nfa.put(e, valid('/'));
let g = nfa.put(f, invalid('/'));
nfa.put_state(c, c);
nfa.put_state(g, g);
nfa.acceptance(g);
nfa.start_capture(c);
nfa.end_capture(c);
nfa.start_capture(g);
nfa.end_capture(g);
let post = nfa.process("p/123/c/456", |a| a);
assert_eq!(post.unwrap().captures, vec!["123", "456"]);
}
#[test]
fn test_ascii_set() {
let mut set = CharSet::new();
set.insert('?');
set.insert('a');
set.insert('é');
assert!(set.contains('?'), "The set contains char 63");
assert!(set.contains('a'), "The set contains char 97");
assert!(set.contains('é'), "The set contains char 233");
assert!(!set.contains('q'), "The set does not contain q");
assert!(!set.contains('ü'), "The set does not contain ü");
}
fn valid(char: char) -> CharacterClass {
CharacterClass::valid_char(char)
}
fn invalid(char: char) -> CharacterClass {
CharacterClass::invalid_char(char)
}
}