route-recognizer-0.3.1/.cargo_vcs_info.json0000644000000001360000000000100143400ustar { "git": { "sha1": "14b94114a83a4e0d4398bce2e77dc501f0f7ab1a" }, "path_in_vcs": "" }route-recognizer-0.3.1/.github/CODE_OF_CONDUCT000064400000000000000000000062650072674642500167310ustar 00000000000000# Contributor Covenant Code of Conduct ## Our Pledge In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. ## Our Standards Examples of behavior that contributes to creating a positive environment include: - Using welcoming and inclusive language - Being respectful of differing viewpoints and experiences - Gracefully accepting constructive criticism - Focusing on what is best for the community - Showing empathy towards other community members Examples of unacceptable behavior by participants include: - The use of sexualized language or imagery and unwelcome sexual attention or advances - Trolling, insulting/derogatory comments, and personal or political attacks - Public or private harassment - Publishing others' private information, such as a physical or electronic address, without explicit permission - Other conduct which could reasonably be considered inappropriate in a professional setting ## Our Responsibilities Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. 
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. ## Scope This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. ## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at yoshuawuyts@gmail.com, or through IRC. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. ## Attribution This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html route-recognizer-0.3.1/.github/CONTRIBUTING.md000064400000000000000000000052660072674642500167620ustar 00000000000000# Contributing Contributions include code, documentation, answering user questions, running the project's infrastructure, and advocating for all types of users. 
The project welcomes all contributions from anyone willing to work in good faith with other contributors and the community. No contribution is too small and all contributions are valued. This guide explains the process for contributing to the project's GitHub Repository.

- [Code of Conduct](#code-of-conduct)
- [Bad Actors](#bad-actors)

## Code of Conduct

The project has a [Code of Conduct](./CODE_OF_CONDUCT.md) that *all* contributors are expected to follow. This code describes the *minimum* behavior expectations for all contributors. As a contributor, how you choose to act and interact towards your fellow contributors, as well as to the community, will reflect back not only on yourself but on the project as a whole. The Code of Conduct is designed and intended, above all else, to help establish a culture within the project that allows anyone and everyone who wants to contribute to feel safe doing so. Should any individual act in any way that is considered in violation of the [Code of Conduct](./CODE_OF_CONDUCT.md), corrective actions will be taken. It is possible, however, for any individual to *act* in such a manner that is not in violation of the strict letter of the Code of Conduct guidelines while still going completely against the spirit of what that Code is intended to accomplish. Open, diverse, and inclusive communities live and die on the basis of trust. Contributors can disagree with one another so long as they trust that those disagreements are in good faith and everyone is working towards a common goal.

## Bad Actors

All contributors tacitly agree to abide by both the letter and spirit of the [Code of Conduct](./CODE_OF_CONDUCT.md). Failure, or unwillingness, to do so will result in contributions being respectfully declined. A *bad actor* is someone who repeatedly violates the *spirit* of the Code of Conduct through consistent failure to self-regulate the way in which they interact with other contributors in the project.
In doing so, bad actors alienate other contributors, discourage collaboration, and generally reflect poorly on the project as a whole. Being a bad actor may be intentional or unintentional. Typically, unintentional bad behavior can be easily corrected by being quick to apologize and correct course *even if you are not entirely convinced you need to*. Giving other contributors the benefit of the doubt and having a sincere willingness to admit that you *might* be wrong is critical for any successful open collaboration. Don't be a bad actor. route-recognizer-0.3.1/.github/workflows/ci.yaml000064400000000000000000000022050072674642500200330ustar 00000000000000name: CI on: pull_request: push: branches: - main - staging - trying env: RUSTFLAGS: -Dwarnings jobs: build_and_test: name: Build and test runs-on: ${{ matrix.os }} strategy: matrix: os: [ubuntu-latest, windows-latest, macOS-latest] rust: [stable] steps: - uses: actions/checkout@master - name: Install ${{ matrix.rust }} uses: actions-rs/toolchain@v1 with: toolchain: ${{ matrix.rust }} override: true - name: check uses: actions-rs/cargo@v1 with: command: check args: --all --bins --examples - name: tests uses: actions-rs/cargo@v1 with: command: test args: --all check_fmt_and_docs: name: Checking fmt and docs runs-on: ubuntu-latest steps: - uses: actions/checkout@master - uses: actions-rs/toolchain@v1 with: toolchain: nightly components: rustfmt, clippy override: true - name: fmt run: cargo fmt --all -- --check - name: Docs run: cargo doc route-recognizer-0.3.1/.gitignore000064400000000000000000000000260072674642500151460ustar 00000000000000/target /Cargo.lock route-recognizer-0.3.1/Cargo.toml0000644000000013750000000000100123440ustar # THIS FILE IS AUTOMATICALLY GENERATED BY CARGO # # When uploading crates to the registry Cargo will automatically # "normalize" Cargo.toml files for maximal compatibility # with all versions of Cargo and also rewrite `path` dependencies # to registry (e.g., crates.io) dependencies. 
# # If you are reading this file be aware that the original Cargo.toml # will likely look very different (and much more reasonable). # See Cargo.toml.orig for the original contents. [package] edition = "2018" name = "route-recognizer" version = "0.3.1" authors = ["wycats", "rustasync"] description = "Recognizes URL patterns with support for dynamic and wildcard segments" keywords = ["router", "url"] license = "MIT" repository = "https://github.com/rustasync/route-recognizer" route-recognizer-0.3.1/Cargo.toml.orig000064400000000000000000000004650072674642500160540ustar 00000000000000[package] name = "route-recognizer" description = "Recognizes URL patterns with support for dynamic and wildcard segments" license = "MIT" repository = "https://github.com/rustasync/route-recognizer" keywords = ["router", "url"] edition = "2018" version = "0.3.1" authors = ["wycats", "rustasync"] route-recognizer-0.3.1/LICENSE-MIT000064400000000000000000000022470072674642500146210ustar 00000000000000The MIT License (MIT) Copyright (c) 2020 The http-rs contributors Copyright (c) 2019 The rustasync contributors Copyright (c) 2014 Yehuda Katz Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. route-recognizer-0.3.1/README.md000064400000000000000000000043350072674642500144440ustar 00000000000000

# route-recognizer

Recognizes URL patterns with support for dynamic and wildcard segments

API Docs | Releases | Contributing

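The module documentation in `src/lib.rs` describes four kinds of route segments: static segments, `:params`, named `*wildcards`, and the unnamed `*`. The following is a minimal, self-contained sketch of how a route string can be split into those kinds by looking at each segment's first character. It does not use this crate's API: `SegmentKind` and `classify` are hypothetical names introduced only for illustration, and the sketch splits on `/` only, whereas the crate also treats `.` as a separator.

```rust
/// Hypothetical illustration type; not part of route-recognizer's API.
#[derive(Debug, PartialEq)]
enum SegmentKind<'a> {
    /// Literal text that has to match exactly, e.g. `posts`.
    Static(&'a str),
    /// `:name` stands for a single path segment (it never spans a `/`).
    Param(&'a str),
    /// `*name` stands for the rest of the path, `/` included.
    NamedWildcard(&'a str),
    /// A bare `*` matches like a wildcard but is not reported by name.
    UnnamedWildcard,
}

/// Classify each `/`-separated segment of `route` by its first character.
/// Assumption: one optional leading `/` is ignored, as in `Router::add`.
fn classify(route: &str) -> Vec<SegmentKind<'_>> {
    let route = route.strip_prefix('/').unwrap_or(route);
    route
        .split('/')
        .map(|segment| {
            if let Some(name) = segment.strip_prefix(':') {
                SegmentKind::Param(name)
            } else if segment == "*" {
                SegmentKind::UnnamedWildcard
            } else if let Some(name) = segment.strip_prefix('*') {
                SegmentKind::NamedWildcard(name)
            } else {
                SegmentKind::Static(segment)
            }
        })
        .collect()
}
```

With this sketch, `classify("/posts/:id")` yields `[Static("posts"), Param("id")]`, and `classify("/a/*b")` yields `[Static("a"), NamedWildcard("b")]`.
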
## Installation

```sh
$ cargo add route-recognizer
```

## Safety

This crate uses ``#![deny(unsafe_code)]`` to ensure everything is implemented in 100% Safe Rust.

## Contributing

Want to join us? Check out our ["Contributing" guide][contributing] and take a look at some of these issues:

- [Issues labeled "good first issue"][good-first-issue]
- [Issues labeled "help wanted"][help-wanted]

[contributing]: https://github.com/http-rs/route-recognizer/blob/master/.github/CONTRIBUTING.md
[good-first-issue]: https://github.com/http-rs/route-recognizer/labels/good%20first%20issue
[help-wanted]: https://github.com/http-rs/route-recognizer/labels/help%20wanted

## License

Licensed under the MIT license.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as MIT / Apache-2.0, without any additional terms or conditions.
route-recognizer-0.3.1/benches/bench.rs000064400000000000000000000011520072674642500162130ustar 00000000000000
#![feature(test)]

extern crate route_recognizer;
extern crate test;

use route_recognizer::Router;

#[bench]
fn benchmark(b: &mut test::Bencher) {
    let mut router = Router::new();
    router.add("/posts/:post_id/comments/:id", "comment".to_string());
    router.add("/posts/:post_id/comments", "comments".to_string());
    router.add("/posts/:post_id", "post".to_string());
    router.add("/posts", "posts".to_string());
    router.add("/comments", "comments2".to_string());
    router.add("/comments/:id", "comment2".to_string());

    b.iter(|| router.recognize("/posts/100/comments/200"));
}
route-recognizer-0.3.1/benches/nfa.rs000064400000000000000000000020500072674642500156760ustar 00000000000000
#![feature(test)]

extern crate route_recognizer;
extern crate test;

use route_recognizer::nfa::CharSet;
use std::collections::{BTreeSet, HashSet};

#[bench]
fn bench_char_set(b: &mut test::Bencher) {
    let mut set = CharSet::new();
    set.insert('p');
    set.insert('n');
    set.insert('/');

    b.iter(|| {
        assert!(set.contains('p'));
        assert!(set.contains('/'));
        assert!(!set.contains('z'));
    });
}

#[bench]
fn bench_hash_set(b: &mut test::Bencher) {
    let mut set = HashSet::new();
    set.insert('p');
    set.insert('n');
    set.insert('/');

    b.iter(|| {
        assert!(set.contains(&'p'));
        assert!(set.contains(&'/'));
        assert!(!set.contains(&'z'));
    });
}

#[bench]
fn bench_btree_set(b: &mut test::Bencher) {
    let mut set = BTreeSet::new();
    set.insert('p');
    set.insert('n');
    set.insert('/');

    b.iter(|| {
        assert!(set.contains(&'p'));
        assert!(set.contains(&'/'));
        assert!(!set.contains(&'z'));
    });
}
route-recognizer-0.3.1/src/lib.rs000064400000000000000000000423510072674642500150700ustar 00000000000000
//!
Recognizes URL patterns with support for dynamic and wildcard segments
//!
//! # Examples
//!
//! ```
//! use route_recognizer::{Router, Params};
//!
//! let mut router = Router::new();
//!
//! router.add("/thomas", "Thomas".to_string());
//! router.add("/tom", "Tom".to_string());
//! router.add("/wycats", "Yehuda".to_string());
//!
//! let m = router.recognize("/thomas").unwrap();
//!
//! assert_eq!(m.handler().as_str(), "Thomas");
//! assert_eq!(m.params(), &Params::new());
//! ```
//!
//! # Routing params
//!
//! The router supports four kinds of route segments:
//! - __segments__: these are of the format `/a/b`.
//! - __params__: these are of the format `/a/:b`.
//! - __named wildcards__: these are of the format `/a/*b`.
//! - __unnamed wildcards__: these are of the format `/a/*`.
//!
//! The difference between a "named wildcard" and a "param" is how the
//! matching rules apply. Given the route `/a/:b`, passing in `/a/bar/baz`
//! will not match because `/baz` has no counterpart in the route.
//!
//! However if we define the route `/a/*b` and we pass `/a/bar/baz` we end up
//! with a named param `"b"` that contains the value `"bar/baz"`. Wildcard
//! routing rules are useful when you don't know which routes may follow. The
//! difference between "named" and "unnamed" wildcards is that the former will
//! show up in `Params`, while the latter won't.
#![cfg_attr(feature = "docs", feature(doc_cfg))]
#![deny(unsafe_code)]
#![deny(missing_debug_implementations, nonstandard_style)]
#![warn(missing_docs, unreachable_pub, future_incompatible, rust_2018_idioms)]
#![doc(test(attr(deny(warnings))))]
#![doc(test(attr(allow(unused_extern_crates, unused_variables))))]
#![doc(html_favicon_url = "https://yoshuawuyts.com/assets/http-rs/favicon.ico")]
#![doc(html_logo_url = "https://yoshuawuyts.com/assets/http-rs/logo-rounded.png")]

use std::cmp::Ordering;
use std::collections::{btree_map, BTreeMap};
use std::ops::Index;

use crate::nfa::{CharacterClass, NFA};

#[doc(hidden)]
pub mod nfa;

#[derive(Clone, Eq, Debug)]
struct Metadata {
    statics: u32,
    dynamics: u32,
    wildcards: u32,
    param_names: Vec<String>,
}

impl Metadata {
    pub(crate) fn new() -> Self {
        Self {
            statics: 0,
            dynamics: 0,
            wildcards: 0,
            param_names: Vec::new(),
        }
    }
}

impl Ord for Metadata {
    fn cmp(&self, other: &Self) -> Ordering {
        if self.statics > other.statics {
            Ordering::Greater
        } else if self.statics < other.statics {
            Ordering::Less
        } else if self.dynamics > other.dynamics {
            Ordering::Greater
        } else if self.dynamics < other.dynamics {
            Ordering::Less
        } else if self.wildcards > other.wildcards {
            Ordering::Greater
        } else if self.wildcards < other.wildcards {
            Ordering::Less
        } else {
            Ordering::Equal
        }
    }
}

impl PartialOrd for Metadata {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Metadata {
    fn eq(&self, other: &Self) -> bool {
        self.statics == other.statics
            && self.dynamics == other.dynamics
            && self.wildcards == other.wildcards
    }
}

/// Router parameters.
#[derive(PartialEq, Clone, Debug, Default)]
pub struct Params {
    map: BTreeMap<String, String>,
}

impl Params {
    /// Create a new instance of `Params`.
    pub fn new() -> Self {
        Self {
            map: BTreeMap::new(),
        }
    }

    /// Insert a new param into `Params`.
    pub fn insert(&mut self, key: String, value: String) {
        self.map.insert(key, value);
    }

    /// Find a param by name in `Params`.
    pub fn find(&self, key: &str) -> Option<&str> {
        self.map.get(key).map(|s| &s[..])
    }

    /// Iterate over all named params.
    ///
    /// This will return all named params and named wildcards.
    pub fn iter(&self) -> Iter<'_> {
        Iter(self.map.iter())
    }
}

impl Index<&str> for Params {
    type Output = String;
    fn index(&self, index: &str) -> &String {
        match self.map.get(index) {
            None => panic!("params[{}] did not exist", index),
            Some(s) => s,
        }
    }
}

impl<'a> IntoIterator for &'a Params {
    type IntoIter = Iter<'a>;
    type Item = (&'a str, &'a str);

    fn into_iter(self) -> Iter<'a> {
        self.iter()
    }
}

/// An iterator over `Params`.
#[derive(Debug)]
pub struct Iter<'a>(btree_map::Iter<'a, String, String>);

impl<'a> Iterator for Iter<'a> {
    type Item = (&'a str, &'a str);

    #[inline]
    fn next(&mut self) -> Option<(&'a str, &'a str)> {
        self.0.next().map(|(k, v)| (&**k, &**v))
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.0.size_hint()
    }
}

/// The result of a successful match returned by `Router::recognize`.
#[derive(Debug)]
pub struct Match<T> {
    /// Return the endpoint handler.
    handler: T,
    /// Return the params.
    params: Params,
}

impl<T> Match<T> {
    /// Create a new instance of `Match`.
    pub fn new(handler: T, params: Params) -> Self {
        Self { handler, params }
    }

    /// Get a handle to the handler.
    pub fn handler(&self) -> &T {
        &self.handler
    }

    /// Get a mutable handle to the handler.
    pub fn handler_mut(&mut self) -> &mut T {
        &mut self.handler
    }

    /// Get a handle to the params.
    pub fn params(&self) -> &Params {
        &self.params
    }

    /// Get a mutable handle to the params.
    pub fn params_mut(&mut self) -> &mut Params {
        &mut self.params
    }
}

/// Recognizes URL patterns with support for dynamic and wildcard segments.
#[derive(Clone, Debug)]
pub struct Router<T> {
    nfa: NFA<Metadata>,
    handlers: BTreeMap<usize, T>,
}

fn segments(route: &str) -> Vec<(Option<char>, &str)> {
    let predicate = |c| c == '.' || c == '/';

    let mut segments = vec![];
    let mut segment_start = 0;

    while segment_start < route.len() {
        let segment_end = route[segment_start + 1..]
            .find(predicate)
            .map(|i| i + segment_start + 1)
            .unwrap_or_else(|| route.len());
        let potential_sep = route.chars().nth(segment_start);
        let sep_and_segment = match potential_sep {
            Some(sep) if predicate(sep) => (Some(sep), &route[segment_start + 1..segment_end]),
            _ => (None, &route[segment_start..segment_end]),
        };

        segments.push(sep_and_segment);
        segment_start = segment_end;
    }

    segments
}

impl<T> Router<T> {
    /// Create a new instance of `Router`.
    pub fn new() -> Self {
        Self {
            nfa: NFA::new(),
            handlers: BTreeMap::new(),
        }
    }

    /// Add a route to the router.
    pub fn add(&mut self, mut route: &str, dest: T) {
        if !route.is_empty() && route.as_bytes()[0] == b'/' {
            route = &route[1..];
        }

        let nfa = &mut self.nfa;
        let mut state = 0;
        let mut metadata = Metadata::new();

        for (separator, segment) in segments(route) {
            if let Some(separator) = separator {
                state = nfa.put(state, CharacterClass::valid_char(separator));
            }

            if !segment.is_empty() && segment.as_bytes()[0] == b':' {
                state = process_dynamic_segment(nfa, state);
                metadata.dynamics += 1;
                metadata.param_names.push(segment[1..].to_string());
            } else if !segment.is_empty() && segment.as_bytes()[0] == b'*' {
                state = process_star_state(nfa, state);
                metadata.wildcards += 1;
                metadata.param_names.push(segment[1..].to_string());
            } else {
                state = process_static_segment(segment, nfa, state);
                metadata.statics += 1;
            }
        }

        nfa.acceptance(state);
        nfa.metadata(state, metadata);
        self.handlers.insert(state, dest);
    }

    /// Match a route on the router.
    pub fn recognize(&self, mut path: &str) -> Result<Match<&T>, String> {
        if !path.is_empty() && path.as_bytes()[0] == b'/' {
            path = &path[1..];
        }

        let nfa = &self.nfa;
        let result = nfa.process(path, |index| nfa.get(index).metadata.as_ref().unwrap());

        match result {
            Ok(nfa_match) => {
                let mut map = Params::new();
                let state = &nfa.get(nfa_match.state);
                let metadata = state.metadata.as_ref().unwrap();
                let param_names = metadata.param_names.clone();

                for (i, capture) in nfa_match.captures.iter().enumerate() {
                    if !param_names[i].is_empty() {
                        map.insert(param_names[i].to_string(), capture.to_string());
                    }
                }

                let handler = self.handlers.get(&nfa_match.state).unwrap();
                Ok(Match::new(handler, map))
            }
            Err(str) => Err(str),
        }
    }
}

impl<T> Default for Router<T> {
    fn default() -> Self {
        Self::new()
    }
}

fn process_static_segment<T>(segment: &str, nfa: &mut NFA<T>, mut state: usize) -> usize {
    for char in segment.chars() {
        state = nfa.put(state, CharacterClass::valid_char(char));
    }

    state
}

fn process_dynamic_segment<T>(nfa: &mut NFA<T>, mut state: usize) -> usize {
    state = nfa.put(state, CharacterClass::invalid_char('/'));
    nfa.put_state(state, state);
    nfa.start_capture(state);
    nfa.end_capture(state);

    state
}

fn process_star_state<T>(nfa: &mut NFA<T>, mut state: usize) -> usize {
    state = nfa.put(state, CharacterClass::any());
    nfa.put_state(state, state);
    nfa.start_capture(state);
    nfa.end_capture(state);

    state
}

#[cfg(test)]
mod tests {
    use super::{Params, Router};

    #[test]
    fn basic_router() {
        let mut router = Router::new();

        router.add("/thomas", "Thomas".to_string());
        router.add("/tom", "Tom".to_string());
        router.add("/wycats", "Yehuda".to_string());

        let m = router.recognize("/thomas").unwrap();

        assert_eq!(*m.handler, "Thomas".to_string());
        assert_eq!(m.params, Params::new());
    }

    #[test]
    fn root_router() {
        let mut router = Router::new();
        router.add("/", 10);
        assert_eq!(*router.recognize("/").unwrap().handler, 10)
    }

    #[test]
    fn empty_path() {
        let mut router = Router::new();
        router.add("/", 12);
        assert_eq!(*router.recognize("").unwrap().handler, 12)
    }

    #[test]
    fn empty_route() {
        let mut router = Router::new();
        router.add("", 12);
        assert_eq!(*router.recognize("/").unwrap().handler, 12)
    }

    #[test]
    fn ambiguous_router() {
        let mut router = Router::new();

        router.add("/posts/new", "new".to_string());
        router.add("/posts/:id", "id".to_string());

        let id = router.recognize("/posts/1").unwrap();

        assert_eq!(*id.handler, "id".to_string());
        assert_eq!(id.params, params("id", "1"));

        let new = router.recognize("/posts/new").unwrap();
        assert_eq!(*new.handler, "new".to_string());
        assert_eq!(new.params, Params::new());
    }

    #[test]
    fn ambiguous_router_b() {
        let mut router = Router::new();

        router.add("/posts/:id", "id".to_string());
        router.add("/posts/new", "new".to_string());

        let id = router.recognize("/posts/1").unwrap();

        assert_eq!(*id.handler, "id".to_string());
        assert_eq!(id.params, params("id", "1"));

        let new = router.recognize("/posts/new").unwrap();
        assert_eq!(*new.handler, "new".to_string());
        assert_eq!(new.params, Params::new());
    }

    #[test]
    fn multiple_params() {
        let mut router = Router::new();

        router.add("/posts/:post_id/comments/:id", "comment".to_string());
        router.add("/posts/:post_id/comments", "comments".to_string());

        let com = router.recognize("/posts/12/comments/100").unwrap();
        let coms = router.recognize("/posts/12/comments").unwrap();

        assert_eq!(*com.handler, "comment".to_string());
        assert_eq!(com.params, two_params("post_id", "12", "id", "100"));

        assert_eq!(*coms.handler, "comments".to_string());
        assert_eq!(coms.params, params("post_id", "12"));
        assert_eq!(coms.params["post_id"], "12".to_string());
    }

    #[test]
    fn wildcard() {
        let mut router = Router::new();

        router.add("*foo", "test".to_string());
        router.add("/bar/*foo", "test2".to_string());

        let m = router.recognize("/test").unwrap();
        assert_eq!(*m.handler, "test".to_string());
        assert_eq!(m.params, params("foo", "test"));

        let m = router.recognize("/foo/bar").unwrap();
        assert_eq!(*m.handler, "test".to_string());
        assert_eq!(m.params, params("foo", "foo/bar"));

        let m = router.recognize("/bar/foo").unwrap();
        assert_eq!(*m.handler, "test2".to_string());
        assert_eq!(m.params, params("foo", "foo"));
    }

    #[test]
    fn wildcard_colon() {
        let mut router = Router::new();

        router.add("/a/*b", "ab".to_string());
        router.add("/a/*b/c", "abc".to_string());
        router.add("/a/*b/c/:d", "abcd".to_string());

        let m = router.recognize("/a/foo").unwrap();
        assert_eq!(*m.handler, "ab".to_string());
        assert_eq!(m.params, params("b", "foo"));

        let m = router.recognize("/a/foo/bar").unwrap();
        assert_eq!(*m.handler, "ab".to_string());
        assert_eq!(m.params, params("b", "foo/bar"));

        let m = router.recognize("/a/foo/c").unwrap();
        assert_eq!(*m.handler, "abc".to_string());
        assert_eq!(m.params, params("b", "foo"));

        let m = router.recognize("/a/foo/bar/c").unwrap();
        assert_eq!(*m.handler, "abc".to_string());
        assert_eq!(m.params, params("b", "foo/bar"));

        let m = router.recognize("/a/foo/c/baz").unwrap();
        assert_eq!(*m.handler, "abcd".to_string());
        assert_eq!(m.params, two_params("b", "foo", "d", "baz"));

        let m = router.recognize("/a/foo/bar/c/baz").unwrap();
        assert_eq!(*m.handler, "abcd".to_string());
        assert_eq!(m.params, two_params("b", "foo/bar", "d", "baz"));

        let m = router.recognize("/a/foo/bar/c/baz/bay").unwrap();
        assert_eq!(*m.handler, "ab".to_string());
        assert_eq!(m.params, params("b", "foo/bar/c/baz/bay"));
    }

    #[test]
    fn unnamed_parameters() {
        let mut router = Router::new();

        router.add("/foo/:/bar", "test".to_string());
        router.add("/foo/:bar/*", "test2".to_string());
        let m = router.recognize("/foo/test/bar").unwrap();
        assert_eq!(*m.handler, "test");
        assert_eq!(m.params, Params::new());

        let m = router.recognize("/foo/test/blah").unwrap();
        assert_eq!(*m.handler, "test2");
        assert_eq!(m.params, params("bar", "test"));
    }

    fn params(key: &str, val: &str) -> Params {
        let mut map = Params::new();
        map.insert(key.to_string(), val.to_string());
        map
    }

    fn two_params(k1: &str, v1: &str, k2: &str, v2: &str) -> Params {
        let mut
map = Params::new();
        map.insert(k1.to_string(), v1.to_string());
        map.insert(k2.to_string(), v2.to_string());
        map
    }

    #[test]
    fn dot() {
        let mut router = Router::new();
        router.add("/1/baz.:wibble", ());
        router.add("/2/:bar.baz", ());
        router.add("/3/:dynamic.:extension", ());
        router.add("/4/static.static", ());

        let m = router.recognize("/1/baz.jpg").unwrap();
        assert_eq!(m.params, params("wibble", "jpg"));

        let m = router.recognize("/2/test.baz").unwrap();
        assert_eq!(m.params, params("bar", "test"));

        let m = router.recognize("/3/any.thing").unwrap();
        assert_eq!(m.params, two_params("dynamic", "any", "extension", "thing"));

        let m = router.recognize("/3/this.performs.a.greedy.match").unwrap();
        assert_eq!(
            m.params,
            two_params("dynamic", "this.performs.a.greedy", "extension", "match")
        );

        let m = router.recognize("/4/static.static").unwrap();
        assert_eq!(m.params, Params::new());

        let m = router.recognize("/4/static/static");
        assert!(m.is_err());

        let m = router.recognize("/4.static.static");
        assert!(m.is_err());
    }

    #[test]
    fn test_chinese() {
        let mut router = Router::new();
        router.add("/crates/:foo/:bar", "Hello".to_string());

        let m = router.recognize("/crates/实打实打算/d's'd").unwrap();
        assert_eq!(m.handler().as_str(), "Hello");
        assert_eq!(m.params().find("foo"), Some("实打实打算"));
        assert_eq!(m.params().find("bar"), Some("d's'd"));
    }
}
route-recognizer-0.3.1/src/nfa.rs000064400000000000000000000425650072674642500150750ustar 00000000000000
use std::collections::HashSet;

use self::CharacterClass::{Ascii, InvalidChars, ValidChars};

#[derive(PartialEq, Eq, Clone, Default, Debug)]
pub struct CharSet {
    low_mask: u64,
    high_mask: u64,
    non_ascii: HashSet<char>,
}

impl CharSet {
    pub fn new() -> Self {
        Self {
            low_mask: 0,
            high_mask: 0,
            non_ascii: HashSet::new(),
        }
    }

    pub fn insert(&mut self, char: char) {
        let val = char as u32 - 1;

        if val > 127 {
            self.non_ascii.insert(char);
        } else if val > 63 {
            let bit = 1 << (val - 64);
            self.high_mask |= bit;
        } else {
            let bit = 1 << val;
            self.low_mask |= bit;
        }
    }

    pub
fn contains(&self, char: char) -> bool {
        let val = char as u32 - 1;

        if val > 127 {
            self.non_ascii.contains(&char)
        } else if val > 63 {
            let bit = 1 << (val - 64);
            self.high_mask & bit != 0
        } else {
            let bit = 1 << val;
            self.low_mask & bit != 0
        }
    }
}

#[derive(PartialEq, Eq, Clone, Debug)]
pub enum CharacterClass {
    Ascii(u64, u64, bool),
    ValidChars(CharSet),
    InvalidChars(CharSet),
}

impl CharacterClass {
    pub fn any() -> Self {
        Ascii(u64::max_value(), u64::max_value(), true)
    }

    pub fn valid(string: &str) -> Self {
        ValidChars(Self::str_to_set(string))
    }

    pub fn invalid(string: &str) -> Self {
        InvalidChars(Self::str_to_set(string))
    }

    pub fn valid_char(char: char) -> Self {
        let val = char as u32 - 1;

        if val > 127 {
            ValidChars(Self::char_to_set(char))
        } else if val > 63 {
            Ascii(1 << (val - 64), 0, false)
        } else {
            Ascii(0, 1 << val, false)
        }
    }

    pub fn invalid_char(char: char) -> Self {
        let val = char as u32 - 1;

        if val > 127 {
            InvalidChars(Self::char_to_set(char))
        } else if val > 63 {
            Ascii(u64::max_value() ^ (1 << (val - 64)), u64::max_value(), true)
        } else {
            Ascii(u64::max_value(), u64::max_value() ^ (1 << val), true)
        }
    }

    pub fn matches(&self, char: char) -> bool {
        match *self {
            ValidChars(ref valid) => valid.contains(char),
            InvalidChars(ref invalid) => !invalid.contains(char),
            Ascii(high, low, unicode) => {
                let val = char as u32 - 1;
                if val > 127 {
                    unicode
                } else if val > 63 {
                    high & (1 << (val - 64)) != 0
                } else {
                    low & (1 << val) != 0
                }
            }
        }
    }

    fn char_to_set(char: char) -> CharSet {
        let mut set = CharSet::new();
        set.insert(char);
        set
    }

    fn str_to_set(string: &str) -> CharSet {
        let mut set = CharSet::new();
        for char in string.chars() {
            set.insert(char);
        }
        set
    }
}

#[derive(Clone)]
struct Thread {
    state: usize,
    captures: Vec<(usize, usize)>,
    capture_begin: Option<usize>,
}

impl Thread {
    pub(crate) fn new() -> Self {
        Self {
            state: 0,
            captures: Vec::new(),
            capture_begin: None,
        }
    }

    #[inline]
    pub(crate) fn start_capture(&mut self, start: usize) {
        self.capture_begin = Some(start);
    }
    #[inline]
    pub(crate) fn end_capture(&mut self, end: usize) {
        self.captures.push((self.capture_begin.unwrap(), end));
        self.capture_begin = None;
    }

    pub(crate) fn extract<'a>(&self, source: &'a str) -> Vec<&'a str> {
        self.captures
            .iter()
            .map(|&(begin, end)| &source[begin..end])
            .collect()
    }
}

#[derive(Clone, Debug)]
pub struct State<T> {
    pub index: usize,
    pub chars: CharacterClass,
    pub next_states: Vec<usize>,
    pub acceptance: bool,
    pub start_capture: bool,
    pub end_capture: bool,
    pub metadata: Option<T>,
}

impl<T> PartialEq for State<T> {
    fn eq(&self, other: &Self) -> bool {
        self.index == other.index
    }
}

impl<T> State<T> {
    pub fn new(index: usize, chars: CharacterClass) -> Self {
        Self {
            index,
            chars,
            next_states: Vec::new(),
            acceptance: false,
            start_capture: false,
            end_capture: false,
            metadata: None,
        }
    }
}

#[derive(Debug)]
pub struct Match<'a> {
    pub state: usize,
    pub captures: Vec<&'a str>,
}

impl<'a> Match<'a> {
    pub fn new(state: usize, captures: Vec<&'_ str>) -> Match<'_> {
        Match { state, captures }
    }
}

#[derive(Clone, Default, Debug)]
pub struct NFA<T> {
    states: Vec<State<T>>,
    start_capture: Vec<bool>,
    end_capture: Vec<bool>,
    acceptance: Vec<bool>,
}

impl<T> NFA<T> {
    pub fn new() -> Self {
        let root = State::new(0, CharacterClass::any());
        Self {
            states: vec![root],
            start_capture: vec![false],
            end_capture: vec![false],
            acceptance: vec![false],
        }
    }

    pub fn process<'a, I, F>(&self, string: &'a str, mut ord: F) -> Result<Match<'a>, String>
    where
        I: Ord,
        F: FnMut(usize) -> I,
    {
        let mut threads = vec![Thread::new()];

        for (i, char) in string.char_indices() {
            let next_threads = self.process_char(threads, char, i);

            if next_threads.is_empty() {
                return Err(format!("Couldn't process {}", string));
            }

            threads = next_threads;
        }

        let returned = threads
            .into_iter()
            .filter(|thread| self.get(thread.state).acceptance);

        let thread = returned
            .fold(None, |prev, y| {
                let y_v = ord(y.state);
                match prev {
                    None => Some((y_v, y)),
                    Some((x_v, x)) => {
                        if x_v < y_v {
                            Some((y_v, y))
                        } else {
                            Some((x_v, x))
                        }
                    }
                }
            })
            .map(|p| p.1);

        match thread {
            None => Err("The \
string was exhausted before reaching an \
                         acceptance state"
                .to_string()),
            Some(mut thread) => {
                if thread.capture_begin.is_some() {
                    thread.end_capture(string.len());
                }
                let state = self.get(thread.state);
                Ok(Match::new(state.index, thread.extract(string)))
            }
        }
    }

    #[inline]
    fn process_char(&self, threads: Vec<Thread>, char: char, pos: usize) -> Vec<Thread> {
        let mut returned = Vec::with_capacity(threads.len());

        for mut thread in threads {
            let current_state = self.get(thread.state);

            let mut count = 0;
            let mut found_state = 0;

            for &index in &current_state.next_states {
                let state = &self.states[index];

                if state.chars.matches(char) {
                    count += 1;
                    found_state = index;
                }
            }

            if count == 1 {
                thread.state = found_state;
                capture(self, &mut thread, current_state.index, found_state, pos);
                returned.push(thread);
                continue;
            }

            for &index in &current_state.next_states {
                let state = &self.states[index];
                if state.chars.matches(char) {
                    let mut thread = fork_thread(&thread, state);
                    capture(self, &mut thread, current_state.index, index, pos);
                    returned.push(thread);
                }
            }
        }

        returned
    }

    #[inline]
    pub fn get(&self, state: usize) -> &State<T> {
        &self.states[state]
    }

    pub fn get_mut(&mut self, state: usize) -> &mut State<T> {
        &mut self.states[state]
    }

    pub fn put(&mut self, index: usize, chars: CharacterClass) -> usize {
        {
            let state = self.get(index);

            for &index in &state.next_states {
                let state = self.get(index);
                if state.chars == chars {
                    return index;
                }
            }
        }

        let state = self.new_state(chars);
        self.get_mut(index).next_states.push(state);
        state
    }

    pub fn put_state(&mut self, index: usize, child: usize) {
        if !self.states[index].next_states.contains(&child) {
            self.get_mut(index).next_states.push(child);
        }
    }

    pub fn acceptance(&mut self, index: usize) {
        self.get_mut(index).acceptance = true;
        self.acceptance[index] = true;
    }

    pub fn start_capture(&mut self, index: usize) {
        self.get_mut(index).start_capture = true;
        self.start_capture[index] = true;
    }

    pub fn end_capture(&mut self, index: usize) {
        self.get_mut(index).end_capture = true;
        self.end_capture[index] = true;
    }

    pub fn metadata(&mut self, index: usize, metadata: T) {
        self.get_mut(index).metadata = Some(metadata);
    }

    fn new_state(&mut self, chars: CharacterClass) -> usize {
        let index = self.states.len();
        let state = State::new(index, chars);
        self.states.push(state);

        self.acceptance.push(false);
        self.start_capture.push(false);
        self.end_capture.push(false);

        index
    }
}

#[inline]
fn fork_thread<T>(thread: &Thread, state: &State<T>) -> Thread {
    let mut new_trace = thread.clone();
    new_trace.state = state.index;
    new_trace
}

#[inline]
fn capture<T>(
    nfa: &NFA<T>,
    thread: &mut Thread,
    current_state: usize,
    next_state: usize,
    pos: usize,
) {
    if thread.capture_begin == None && nfa.start_capture[next_state] {
        thread.start_capture(pos);
    }

    if thread.capture_begin != None
        && nfa.end_capture[current_state]
        && next_state > current_state
    {
        thread.end_capture(pos);
    }
}

#[cfg(test)]
mod tests {
    use super::{CharSet, CharacterClass, NFA};

    #[test]
    fn basic_test() {
        let mut nfa = NFA::<()>::new();
        let a = nfa.put(0, CharacterClass::valid("h"));
        let b = nfa.put(a, CharacterClass::valid("e"));
        let c = nfa.put(b, CharacterClass::valid("l"));
        let d = nfa.put(c, CharacterClass::valid("l"));
        let e = nfa.put(d, CharacterClass::valid("o"));
        nfa.acceptance(e);

        let m = nfa.process("hello", |a| a);

        assert!(
            m.unwrap().state == e,
            "You didn't get the right final state"
        );
    }

    #[test]
    fn multiple_solutions() {
        let mut nfa = NFA::<()>::new();
        let a1 = nfa.put(0, CharacterClass::valid("n"));
        let b1 = nfa.put(a1, CharacterClass::valid("e"));
        let c1 = nfa.put(b1, CharacterClass::valid("w"));
        nfa.acceptance(c1);

        let a2 = nfa.put(0, CharacterClass::invalid(""));
        let b2 = nfa.put(a2, CharacterClass::invalid(""));
        let c2 = nfa.put(b2, CharacterClass::invalid(""));
        nfa.acceptance(c2);

        let m = nfa.process("new", |a| a);

        assert!(m.unwrap().state == c2, "The two states were not found");
    }

    #[test]
    fn multiple_paths() {
        let mut nfa = NFA::<()>::new();
        let a = nfa.put(0, CharacterClass::valid("t")); // t
        let b1 = nfa.put(a, CharacterClass::valid("h")); // th
        let c1 = nfa.put(b1, CharacterClass::valid("o")); // tho
        let d1 = nfa.put(c1, CharacterClass::valid("m")); // thom
        let e1 = nfa.put(d1, CharacterClass::valid("a")); // thoma
        let f1 = nfa.put(e1, CharacterClass::valid("s")); // thomas

        let b2 = nfa.put(a, CharacterClass::valid("o")); // to
        let c2 = nfa.put(b2, CharacterClass::valid("m")); // tom

        nfa.acceptance(f1);
        nfa.acceptance(c2);

        let thomas = nfa.process("thomas", |a| a);
        let tom = nfa.process("tom", |a| a);
        let thom = nfa.process("thom", |a| a);
        let nope = nfa.process("nope", |a| a);

        assert!(thomas.unwrap().state == f1, "thomas was parsed correctly");
        assert!(tom.unwrap().state == c2, "tom was parsed correctly");
        assert!(thom.is_err(), "thom didn't reach an acceptance state");
        assert!(nope.is_err(), "nope wasn't parsed");
    }

    #[test]
    fn repetitions() {
        let mut nfa = NFA::<()>::new();
        let a = nfa.put(0, CharacterClass::valid("p")); // p
        let b = nfa.put(a, CharacterClass::valid("o")); // po
        let c = nfa.put(b, CharacterClass::valid("s")); // pos
        let d = nfa.put(c, CharacterClass::valid("t")); // post
        let e = nfa.put(d, CharacterClass::valid("s")); // posts
        let f = nfa.put(e, CharacterClass::valid("/")); // posts/
        let g = nfa.put(f, CharacterClass::invalid("/")); // posts/[^/]
        nfa.put_state(g, g);

        nfa.acceptance(g);

        let post = nfa.process("posts/1", |a| a);
        let new_post = nfa.process("posts/new", |a| a);
        let invalid = nfa.process("posts/", |a| a);

        assert!(post.unwrap().state == g, "posts/1 was parsed");
        assert!(new_post.unwrap().state == g, "posts/new was parsed");
        assert!(invalid.is_err(), "posts/ was invalid");
    }

    #[test]
    fn repetitions_with_ambiguous() {
        let mut nfa = NFA::<()>::new();
        let a = nfa.put(0, CharacterClass::valid("p")); // p
        let b = nfa.put(a, CharacterClass::valid("o")); // po
        let c = nfa.put(b, CharacterClass::valid("s")); // pos
        let d = nfa.put(c, CharacterClass::valid("t")); // post
        let e = nfa.put(d, CharacterClass::valid("s")); // posts
        let f = nfa.put(e, CharacterClass::valid("/")); // posts/
        let g1 = nfa.put(f, CharacterClass::invalid("/")); // posts/[^/]
        let g2 = nfa.put(f, CharacterClass::valid("n")); // posts/n
        let h2 = nfa.put(g2, CharacterClass::valid("e")); // posts/ne
        let i2 = nfa.put(h2, CharacterClass::valid("w")); // posts/new

        nfa.put_state(g1, g1);

        nfa.acceptance(g1);
        nfa.acceptance(i2);

        let post = nfa.process("posts/1", |a| a);
        let ambiguous = nfa.process("posts/new", |a| a);
        let invalid = nfa.process("posts/", |a| a);

        assert!(post.unwrap().state == g1, "posts/1 was parsed");
        assert!(ambiguous.unwrap().state == i2, "posts/new was ambiguous");
        assert!(invalid.is_err(), "posts/ was invalid");
    }

    #[test]
    fn captures() {
        let mut nfa = NFA::<()>::new();
        let a = nfa.put(0, CharacterClass::valid("n"));
        let b = nfa.put(a, CharacterClass::valid("e"));
        let c = nfa.put(b, CharacterClass::valid("w"));

        nfa.acceptance(c);
        nfa.start_capture(a);
        nfa.end_capture(c);

        let post = nfa.process("new", |a| a);

        assert_eq!(post.unwrap().captures, vec!["new"]);
    }

    #[test]
    fn capture_mid_match() {
        let mut nfa = NFA::<()>::new();
        let a = nfa.put(0, valid('p'));
        let b = nfa.put(a, valid('/'));
        let c = nfa.put(b, invalid('/'));
        let d = nfa.put(c, valid('/'));
        let e = nfa.put(d, valid('c'));

        nfa.put_state(c, c);
        nfa.acceptance(e);
        nfa.start_capture(c);
        nfa.end_capture(c);

        let post = nfa.process("p/123/c", |a| a);

        assert_eq!(post.unwrap().captures, vec!["123"]);
    }

    #[test]
    fn capture_multiple_captures() {
        let mut nfa = NFA::<()>::new();
        let a = nfa.put(0, valid('p'));
        let b = nfa.put(a, valid('/'));
        let c = nfa.put(b, invalid('/'));
        let d = nfa.put(c, valid('/'));
        let e = nfa.put(d, valid('c'));
        let f = nfa.put(e, valid('/'));
        let g = nfa.put(f, invalid('/'));

        nfa.put_state(c, c);
        nfa.put_state(g, g);
        nfa.acceptance(g);

        nfa.start_capture(c);
        nfa.end_capture(c);

        nfa.start_capture(g);
        nfa.end_capture(g);

        let post = nfa.process("p/123/c/456", |a| a);

        assert_eq!(post.unwrap().captures, vec!["123", "456"]);
    }

    #[test]
    fn test_ascii_set() {
        let mut set = CharSet::new();
        set.insert('?');
        set.insert('a');
        set.insert('é');

        assert!(set.contains('?'), "The set contains char 63");
        assert!(set.contains('a'), "The set contains char 97");
        assert!(set.contains('é'), "The set contains char 233");
        assert!(!set.contains('q'), "The set does not contain q");
        assert!(!set.contains('ü'), "The set does not contain ü");
    }

    fn valid(char: char) -> CharacterClass {
        CharacterClass::valid_char(char)
    }

    fn invalid(char: char) -> CharacterClass {
        CharacterClass::invalid_char(char)
    }
}