mini_histogram-0.3.1/0000755000004100000410000000000013761065275014570 5ustar www-datawww-data
mini_histogram-0.3.1/.travis.yml0000644000004100000410000000012113761065275016673 0ustar www-datawww-data
---
language: ruby
cache: bundler
rvm:
  - 2.1
  - 2.2
  - 2.5
  - 2.6
  - 2.7.0
mini_histogram-0.3.1/README.md0000644000004100000410000002505313761065275016054 0ustar www-datawww-data
# MiniHistogram

[![Build Status](https://travis-ci.org/zombocom/mini_histogram.svg?branch=master)](https://travis-ci.org/zombocom/mini_histogram)

What's a histogram and why should you care? First read [Lies, Damned Lies, and Averages: Perc50, Perc95 explained for Programmers](https://schneems.com/2020/03/17/lies-damned-lies-and-averages-perc50-perc95-explained-for-programmers/). This library lets you build histograms in pure Ruby.

## Installation

Add this line to your application's Gemfile:

```ruby
gem 'mini_histogram'
```

And then execute:

    $ bundle install

Or install it yourself as:

    $ gem install mini_histogram

## Usage

Given an array, this class calculates the "edges" of a histogram. These edges mark the boundaries of the "bins":

```ruby
array = [1,1,1, 5, 5, 5, 5, 10, 10, 10]
histogram = MiniHistogram.new(array)
puts histogram.edges
# => [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
```

It also finds the weights (i.e. the count of values) that would go in each bin:

```ruby
puts histogram.weights
# => [3, 0, 4, 0, 0, 3]
```

This means that the `array` here had three items between 0.0 and 2.0, four items between 4.0 and 6.0, and three items between 10.0 and 12.0.

## Plotting [experimental]

You can plot!
```ruby
require 'mini_histogram/plot'
array = 50.times.map { rand(11.2..11.6) }
histogram = MiniHistogram.new(array)
puts histogram.plot
```

Will generate:

```
               ┌                                        ┐
[11.2 , 11.25) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 9
[11.25, 11.3 ) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 6
[11.3 , 11.35) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇ 4
[11.35, 11.4 ) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇ 4
[11.4 , 11.45) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 11
[11.45, 11.5 ) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 5
[11.5 , 11.55) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 7
[11.55, 11.6 ) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇ 4
               └                                        ┘
                                Frequency
```

Integrated plotting is currently experimental; use it with some caution. If you are on Ruby 2.4+ you can pass an instance of MiniHistogram to [unicode_plot.rb](https://github.com/red-data-tools/unicode_plot.rb):

```ruby
require 'mini_histogram'
require 'unicode_plot'

array = 50.times.map { rand(11.2..11.6) }
histogram = MiniHistogram.new(array)
puts UnicodePlot.histogram(histogram)
```

## Plotting dualing histograms [experimental]

If you're plotting multiple histograms, first normalize the bucket sizes. Second, note that histograms stacked vertically can be hard to compare.
Here's an example: ``` ┌ ┐ [11.2 , 11.28) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 12 [11.28, 11.36) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 22 [11.35, 11.43) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 30 [11.43, 11.51) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 17 [11.5 , 11.58) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 13 [11.58, 11.66) ┤▇▇▇▇▇▇▇ 6 [11.65, 11.73) ┤ 0 [11.73, 11.81) ┤ 0 [11.8 , 11.88) ┤ 0 └ ┘ Frequency ┌ ┐ [11.2 , 11.28) ┤▇▇▇▇ 3 [11.28, 11.36) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 [11.35, 11.43) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 17 [11.43, 11.51) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 25 [11.5 , 11.58) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 15 [11.58, 11.66) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 13 [11.65, 11.73) ┤▇▇▇▇ 3 [11.73, 11.81) ┤▇▇▇▇ 3 [11.8 , 11.88) ┤▇▇▇ 2 └ ┘ Frequency ``` Here's the same data set plotted side-by-side: ``` ┌ ┐ ┌ ┐ [11.2 , 11.28) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 12 [11.2 , 11.28) ┤▇▇▇▇ 3 [11.28, 11.36) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 22 [11.28, 11.36) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 [11.35, 11.43) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 30 [11.35, 11.43) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 17 [11.43, 11.51) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 17 [11.43, 11.51) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 25 [11.5 , 11.58) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 13 [11.5 , 11.58) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 15 [11.58, 11.66) ┤▇▇▇▇▇▇▇ 6 [11.58, 11.66) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 13 [11.65, 11.73) ┤ 0 [11.65, 11.73) ┤▇▇▇▇ 3 [11.73, 11.81) ┤ 0 [11.73, 11.81) ┤▇▇▇▇ 3 [11.8 , 11.88) ┤ 0 [11.8 , 11.88) ┤▇▇▇ 2 └ ┘ └ ┘ Frequency Frequency ``` This method might require more scrolling in the github issue, but makes it easier to compare two distributions. 
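The side-by-side layout can be approximated by padding every line of the first plot to a common width and appending the matching line of the second plot. The following is a minimal sketch of that idea in plain Ruby. The helper name `join_side_by_side`, its `gap:` parameter, and the sample strings are hypothetical and only serve as illustration; they are not part of the gem's API, and the gem's own implementation may differ in detail.

```ruby
# Hypothetical helper (not part of mini_histogram): joins two multi-line
# strings so that they render next to each other.
def join_side_by_side(left, right, gap: 2)
  left_lines = left.lines.map(&:chomp)
  right_lines = right.lines.map(&:chomp)

  # Pad every left-hand line to the widest one (plus a gap) so that the
  # right-hand column starts at the same position on every row.
  width = left_lines.map(&:length).max.to_i + gap
  rows = [left_lines.length, right_lines.length].max

  Array.new(rows) do |i|
    # Missing rows on either side are treated as empty strings.
    (left_lines[i].to_s.ljust(width) + right_lines[i].to_s).rstrip
  end.join("\n")
end

left = "a | 3\nbb | 10\n"
right = "a | 1\nbb | 2\n"
puts join_side_by_side(left, right)
```

Note that this sketch measures width with `String#length`, so text containing ANSI color escapes would need those stripped before padding to line up correctly.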
Here's how you plot dualing histograms: ```ruby require 'mini_histogram/plot' a = MiniHistogram.new [11.205184, 11.223665, 11.228286, 11.23219, 11.233325, 11.234516, 11.245781, 11.248441, 11.250758, 11.255686, 11.265876, 11.26641, 11.279456, 11.281067, 11.284281, 11.287656, 11.289316, 11.289682, 11.292289, 11.294518, 11.296454, 11.299277, 11.305801, 11.306602, 11.309311, 11.318465, 11.318477, 11.322258, 11.328267, 11.334188, 11.339722, 11.340585, 11.346084, 11.346197, 11.351863, 11.35982, 11.362358, 11.364476, 11.365743, 11.368492, 11.368566, 11.36869, 11.37268, 11.374204, 11.374217, 11.374955, 11.376422, 11.377989, 11.383357, 11.383593, 11.385184, 11.394766, 11.395829, 11.398455, 11.399739, 11.401304, 11.411387, 11.411978, 11.413585, 11.413659, 11.418504, 11.419194, 11.419415, 11.421374, 11.4261, 11.427901, 11.429651, 11.434272, 11.435012, 11.440848, 11.447495, 11.456107, 11.457434, 11.467112, 11.471005, 11.473235, 11.485025, 11.485852, 11.488256, 11.488275, 11.499545, 11.509588, 11.51378, 11.51544, 11.520783, 11.52246, 11.522855, 11.5322, 11.533764, 11.544047, 11.552597, 11.558062, 11.567239, 11.569749, 11.575796, 11.588014, 11.614032, 11.615062, 11.618194, 11.635267] b = MiniHistogram.new [11.233813, 11.240717, 11.254617, 11.282013, 11.290658, 11.303213, 11.305237, 11.305299, 11.306397, 11.313867, 11.31397, 11.314444, 11.318032, 11.328111, 11.330127, 11.333235, 11.33678, 11.337799, 11.343758, 11.347798, 11.347915, 11.349594, 11.358198, 11.358507, 11.3628, 11.366111, 11.374993, 11.378195, 11.38166, 11.384867, 11.385235, 11.395825, 11.404434, 11.406065, 11.406677, 11.410244, 11.414527, 11.421267, 11.424535, 11.427231, 11.427869, 11.428548, 11.432594, 11.433524, 11.434903, 11.437769, 11.439761, 11.443437, 11.443846, 11.451106, 11.458503, 11.462256, 11.462324, 11.464342, 11.464716, 11.46477, 11.465271, 11.466843, 11.468789, 11.475492, 11.488113, 11.489616, 11.493736, 11.496842, 11.502074, 11.511367, 11.512634, 11.515562, 11.525771, 11.531415, 11.535379, 11.53966, 
  11.540969, 11.541265, 11.541978, 11.545301, 11.545533, 11.545701, 11.572584, 11.578881, 11.580701, 11.580922, 11.588731, 11.594082, 11.595915, 11.613622, 11.619884, 11.632889, 11.64377, 11.645225, 11.647167, 11.648257, 11.667158, 11.670378, 11.681261, 11.734586, 11.747066, 11.792425, 11.808377, 11.812346]

dual_histogram = MiniHistogram.dual_plot do |x, y|
  x.histogram = a
  x.options = {}
  y.histogram = b
  y.options = {}
end
puts dual_histogram
```

## Alternatives

Alternatives to this gem include https://github.com/mrkn/enumerable-statistics/. I needed this gem to be able to calculate a "shared" or "average" edge value, as seen in this PR: https://github.com/mrkn/enumerable-statistics/pull/23, so that I could add histograms to derailed benchmarks: https://github.com/schneems/derailed_benchmarks/pull/169. This gem provides a `MiniHistogram.set_average_edges!` method to help there. This gem also does not require compiling a native extension (so it is faster to install, but slower at runtime), and it does not extend or monkeypatch any core classes.

[MiniHistogram API Docs](https://rubydoc.info/github/zombocom/mini_histogram/master/MiniHistogram)

## Development

After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/zombocom/mini_histogram.
This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/zombocom/mini_histogram/blob/master/CODE_OF_CONDUCT.md). ## License The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT). ## Code of Conduct Everyone interacting in the MiniHistogram project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/zombocom/mini_histogram/blob/master/CODE_OF_CONDUCT.md). mini_histogram-0.3.1/bin/0000755000004100000410000000000013761065275015340 5ustar www-datawww-datamini_histogram-0.3.1/bin/console0000755000004100000410000000053513761065275016733 0ustar www-datawww-data#!/usr/bin/env ruby require "bundler/setup" require "mini_histogram" # You can add fixtures and/or initialization code here to make experimenting # with your gem easier. You can also use a different console, if you like. # (If you use this, don't forget to add pry to your Gemfile!) 
# require "pry" # Pry.start require "irb" IRB.start(__FILE__) mini_histogram-0.3.1/bin/setup0000755000004100000410000000020313761065275016421 0ustar www-datawww-data#!/usr/bin/env bash set -euo pipefail IFS=$'\n\t' set -vx bundle install # Do any other automated setup that you need to do here mini_histogram-0.3.1/CHANGELOG.md0000644000004100000410000000145513761065275016406 0ustar www-datawww-data## HEAD ## 0.3.1 - Add missing require for stringio (https://github.com/zombocom/mini_histogram/pull/7) ## 0.3.0 - Generate dualing side-by-side histograms (https://github.com/zombocom/mini_histogram/pull/6) ## 0.2.2 - Frozen string optimization in histogram/plot.rb (https://github.com/zombocom/mini_histogram/pull/5) ## 0.2.1 - Added missing constant needed for plotting support (https://github.com/zombocom/mini_histogram/pull/4) ## 0.2.0 - Experimental plotting support added (https://github.com/zombocom/mini_histogram/pull/3) ## 0.1.3 - Handle edge cases (https://github.com/zombocom/mini_histogram/pull/2) ## 0.1.2 - Add `edge` as alias to `edges` ## 0.1.1 - Fix multi histogram weights, with set_average_edges! method (https://github.com/zombocom/mini_histogram/pull/1) ## 0.1.0 - First mini_histogram-0.3.1/mini_histogram.gemspec0000644000004100000410000000262513761065275021153 0ustar www-datawww-datarequire_relative 'lib/mini_histogram/version' Gem::Specification.new do |spec| spec.name = "mini_histogram" spec.version = MiniHistogram::VERSION spec.authors = ["schneems"] spec.email = ["richard.schneeman+foo@gmail.com"] spec.summary = %q{A small gem for building histograms out of Ruby arrays} spec.description = %q{It makes histograms out of Ruby data. How cool is that!? 
Pretty cool if you ask me.} spec.homepage = "https://github.com/zombocom/mini_histogram" spec.license = "MIT" spec.required_ruby_version = Gem::Requirement.new(">= 2.1.0") spec.metadata["homepage_uri"] = spec.homepage # spec.metadata["source_code_uri"] = "blerg" # spec.metadata["changelog_uri"] = "blerg" # Specify which files should be added to the gem when it is released. # The `git ls-files -z` loads the files in the RubyGem that have been added into git. spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) } end spec.bindir = "exe" spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) } spec.require_paths = ["lib"] spec.add_development_dependency "m" # Used for comparison testing, but only supports Ruby 2.4+ # spec.add_development_dependency "enumerable-statistics" spec.add_development_dependency "benchmark-ips" end mini_histogram-0.3.1/.gitignore0000644000004100000410000000012713761065275016560 0ustar www-datawww-data/.bundle/ /.yardoc /_yardoc/ /coverage/ /doc/ /pkg/ /spec/reports/ /tmp/ Gemfile.lock mini_histogram-0.3.1/CODE_OF_CONDUCT.md0000644000004100000410000000625213761065275017374 0ustar www-datawww-data# Contributor Covenant Code of Conduct ## Our Pledge In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation. 
## Our Standards Examples of behavior that contributes to creating a positive environment include: * Using welcoming and inclusive language * Being respectful of differing viewpoints and experiences * Gracefully accepting constructive criticism * Focusing on what is best for the community * Showing empathy towards other community members Examples of unacceptable behavior by participants include: * The use of sexualized language or imagery and unwelcome sexual attention or advances * Trolling, insulting/derogatory comments, and personal or political attacks * Public or private harassment * Publishing others' private information, such as a physical or electronic address, without explicit permission * Other conduct which could reasonably be considered inappropriate in a professional setting ## Our Responsibilities Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. ## Scope This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. ## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at richard.schneeman+foo@gmail.com. 
All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [https://contributor-covenant.org/version/1/4][version] [homepage]: https://contributor-covenant.org [version]: https://contributor-covenant.org/version/1/4/ mini_histogram-0.3.1/Rakefile0000644000004100000410000000136613761065275016243 0ustar www-datawww-datarequire "bundler/gem_tasks" require "rake/testtask" $LOAD_PATH.unshift File.expand_path("./lib", __dir__) Rake::TestTask.new(:test) do |t| t.libs << "test" t.libs << "lib" t.test_files = FileList["test/**/*_test.rb"] end task :default => :test task :bench do require 'benchmark/ips' require 'enumerable/statistics' require 'mini_histogram' array = 1000.times.map { rand } histogram = MiniHistogram.new(array) my_weights = histogram.weights puts array.histogram.weights == my_weights puts array.histogram.weights.inspect puts my_weights.inspect Benchmark.ips do |x| x.report("enumerable stats") { array.histogram } x.report("mini histogram ") { MiniHistogram.new(array).weights } x.compare! 
  end
end
mini_histogram-0.3.1/lib/0000755000004100000410000000000013761065275015336 5ustar www-datawww-data
mini_histogram-0.3.1/lib/mini_histogram/0000755000004100000410000000000013761065275020347 5ustar www-datawww-data
mini_histogram-0.3.1/lib/mini_histogram/plot.rb0000644000004100000410000005307713761065275021666 0ustar www-datawww-data
# frozen_string_literal: true

require 'stringio'
require_relative '../mini_histogram' # allows people to require 'mini_histogram/plot' directly

# Plots the histogram in unicode characters
#
# Thanks to https://github.com/red-data-tools/unicode_plot.rb.
# It could not be used directly because its dependency enumerable-statistics has a hard
# lock on a specific version of Ruby and this library needs to support older Rubies
#
# Example:
#
#   require 'mini_histogram/plot'
#   array = 50.times.map { rand(11.2..11.6) }
#   histogram = MiniHistogram.new(array)
#   puts histogram.plot # => Generates a plot
#
class MiniHistogram
  # This is an object that holds a histogram
  # and its corresponding plot options
  #
  # Example:
  #
  #   x = PlotValue.new
  #   x.values = [1,2,3,4,5]
  #   x.options = {xlabel: "random"}
  #
  #   x.plot # => Generates a histogram plot with these values and options
  class PlotValue
    attr_accessor :histogram, :options

    def initialize
      @histogram = nil
      @options = {}
    end

    def plot
      raise "@histogram cannot be empty; set it via the `values=` or `histogram=` methods" if @histogram.nil?
@histogram.plot(**@options) end def values=(values) @histogram = MiniHistogram.new(values) end def self.dual_plot(plot_a, plot_b) a_lines = plot_a.to_s.lines b_lines = plot_b.to_s.lines max_length = a_lines.map(&:length).max side_by_side = String.new("") a_lines.each_index do |i| side_by_side << a_lines[i].chomp.ljust(max_length) # Remove newline, ensure same length side_by_side << b_lines[i] end return side_by_side end end private_constant :PlotValue def self.dual_plot a = PlotValue.new b = PlotValue.new yield a, b if b.options[:ylabel] == a.options[:ylabel] b.options[:ylabel] = nil end MiniHistogram.set_average_edges!(a.histogram, b.histogram) PlotValue.dual_plot(a.plot, b.plot) end def plot( nbins: nil, closed: :left, symbol: "▇", **kw) hist = self.histogram(*[nbins].compact, closed: closed) edge, counts = hist.edge, hist.weights labels = [] bin_width = edge[1] - edge[0] pad_left, pad_right = 0, 0 (0 ... edge.length).each do |i| val1 = float_round_log10(edge[i], bin_width) val2 = float_round_log10(val1 + bin_width, bin_width) a1 = val1.to_s.split('.', 2).map(&:length) a2 = val2.to_s.split('.', 2).map(&:length) pad_left = [pad_left, a1[0], a2[0]].max pad_right = [pad_right, a1[1], a2[1]].max end l_str = hist.closed == :right ? "(" : "[" r_str = hist.closed == :right ? 
"]" : ")" counts.each_with_index do |n, i| val1 = float_round_log10(edge[i], bin_width) val2 = float_round_log10(val1 + bin_width, bin_width) a1 = val1.to_s.split('.', 2).map(&:length) a2 = val2.to_s.split('.', 2).map(&:length) labels[i] = "\e[90m#{l_str}\e[0m" + (" " * (pad_left - a1[0])) + val1.to_s + (" " * (pad_right - a1[1])) + "\e[90m, \e[0m" + (" " * (pad_left - a2[0])) + val2.to_s + (" " * (pad_right - a2[1])) + "\e[90m#{r_str}\e[0m" end xscale = kw.delete(:xscale) xlabel = kw.delete(:xlabel) || MiniUnicodePlot::ValueTransformer.transform_name(xscale, "Frequency") barplot(labels, counts, symbol: symbol, xscale: xscale, xlabel: xlabel, **kw) end ## Begin copy/pasta from unicode_plot.rb with some slight modifications private def barplot( *args, width: 40, color: :green, symbol: "■", border: :barplot, xscale: nil, xlabel: nil, data: nil, **kw) case args.length when 0 data = Hash(data) keys = data.keys.map(&:to_s) heights = data.values when 2 keys = Array(args[0]) heights = Array(args[1]) else raise ArgumentError, "invalid arguments" end unless keys.length == heights.length raise ArgumentError, "The given vectors must be of the same length" end unless heights.min >= 0 raise ArgumentError, "All values have to be positive. Negative bars are not supported." 
    end

    # The transformer lives under MiniUnicodePlot, so it must be referenced
    # through that namespace from inside MiniHistogram.
    xlabel ||= MiniUnicodePlot::ValueTransformer.transform_name(xscale)
    plot = MiniUnicodePlot::Barplot.new(heights, width, color, symbol, xscale,
                                        border: border, xlabel: xlabel,
                                        **kw)
    keys.each_with_index do |key, i|
      plot.annotate_row!(:l, i, key)
    end

    plot
  end

  private def float_round_log10(x, m)
    if x == 0
      0.0
    elsif x > 0
      x.round(ceil_neg_log10(m) + 1).to_f
    else
      -(-x).round(ceil_neg_log10(m) + 1).to_f
    end
  end

  private def ceil_neg_log10(x)
    if roundable?(-Math.log10(x))
      (-Math.log10(x)).ceil
    else
      (-Math.log10(x)).floor
    end
  end

  INT64_MIN = -9223372036854775808
  INT64_MAX = 9223372036854775807
  private def roundable?(x)
    x.to_i == x && INT64_MIN <= x && x < INT64_MAX
  end

  module MiniUnicodePlot
    module ValueTransformer
      PREDEFINED_TRANSFORM_FUNCTIONS = {
        log: Math.method(:log),
        ln: Math.method(:log),
        log10: Math.method(:log10),
        lg: Math.method(:log10),
        log2: Math.method(:log2),
        lb: Math.method(:log2),
      }.freeze

      def transform_values(func, values)
        return values unless func

        unless func.respond_to?(:call)
          func = PREDEFINED_TRANSFORM_FUNCTIONS[func]
          unless func.respond_to?(:call)
            raise ArgumentError, "func must be callable"
          end
        end

        case values
        when Numeric
          func.(values)
        else
          values.map(&func)
        end
      end

      module_function def transform_name(func, basename="")
        return basename unless func
        case func
        when String, Symbol
          name = func
        when ->(f) { f.respond_to?(:name) }
          name = func.name
        else
          name = "custom"
        end
        "#{basename} [#{name}]"
      end
    end

    module BorderMaps
      BORDER_SOLID = {
        tl: "┌",
        tr: "┐",
        bl: "└",
        br: "┘",
        t: "─",
        l: "│",
        b: "─",
        r: "│"
      }.freeze

      BORDER_CORNERS = {
        tl: "┌",
        tr: "┐",
        bl: "└",
        br: "┘",
        t: " ",
        l: " ",
        b: " ",
        r: " ",
      }.freeze

      BORDER_BARPLOT = {
        tl: "┌",
        tr: "┐",
        bl: "└",
        br: "┘",
        t: " ",
        l: "┤",
        b: " ",
        r: " ",
      }.freeze
    end

    BORDER_MAP = {
      solid: BorderMaps::BORDER_SOLID,
      corners: BorderMaps::BORDER_CORNERS,
      barplot: BorderMaps::BORDER_BARPLOT,
    }.freeze

    module StyledPrinter
      TEXT_COLORS = {
        black: "\033[30m",
        red: "\033[31m",
        green: "\033[32m",
        yellow: "\033[33m",
        blue: "\033[34m",
        magenta: "\033[35m",
        cyan: "\033[36m",
        white: "\033[37m",
        gray: "\033[90m",
        light_black: "\033[90m",
        light_red: "\033[91m",
        light_green: "\033[92m",
        light_yellow: "\033[93m",
        light_blue: "\033[94m",
        light_magenta: "\033[95m",
        light_cyan: "\033[96m",
        normal: "\033[0m",
        default: "\033[39m",
        bold: "\033[1m",
        underline: "\033[4m",
        blink: "\033[5m",
        reverse: "\033[7m",
        hidden: "\033[8m",
        nothing: "",
      }
      0.upto(255) do |i|
        TEXT_COLORS[i] = "\033[38;5;#{i}m"
      end
      TEXT_COLORS.freeze

      DISABLE_TEXT_STYLE = {
        bold: "\033[22m",
        underline: "\033[24m",
        blink: "\033[25m",
        reverse: "\033[27m",
        hidden: "\033[28m",
        normal: "",
        default: "",
        nothing: "",
      }.freeze

      COLOR_ENCODE = {
        normal: 0b000,
        blue: 0b001,
        red: 0b010,
        magenta: 0b011,
        green: 0b100,
        cyan: 0b101,
        yellow: 0b110,
        white: 0b111
      }.freeze

      COLOR_DECODE = COLOR_ENCODE.map {|k, v| [v, k] }.to_h.freeze

      def print_styled(out, *args, bold: false, color: :normal)
        return out.print(*args) unless color?(out)

        str = StringIO.open {|sio| sio.print(*args); sio.close; sio.string }
        color = :nothing if bold && color == :bold
        enable_ansi = TEXT_COLORS.fetch(color, TEXT_COLORS[:default]) +
                      (bold ? TEXT_COLORS[:bold] : "")
        disable_ansi = (bold ? DISABLE_TEXT_STYLE[:bold] : "") +
                       DISABLE_TEXT_STYLE.fetch(color, TEXT_COLORS[:default])
        first = true
        StringIO.open do |sio|
          str.each_line do |line|
            sio.puts unless first
            first = false
            # Ruby has no `continue` keyword; `next` skips to the next line
            next if line.empty?
            sio.print(enable_ansi, line, disable_ansi)
          end
          sio.close
          out.print(sio.string)
        end
      end

      def print_color(out, color, *args)
        color = COLOR_DECODE[color]
        print_styled(out, *args, color: color)
      end

      def color?(out)
        (out && out.tty?)
|| false end end module BorderPrinter include StyledPrinter def print_border_top(out, padding, length, border=:solid, color: :light_black) return if border == :none b = BORDER_MAP[border] print_styled(out, padding, b[:tl], b[:t] * length, b[:tr], color: color) end def print_border_bottom(out, padding, length, border=:solid, color: :light_black) return if border == :none b = BORDER_MAP[border] print_styled(out, padding, b[:bl], b[:b] * length, b[:br], color: color) end end class Renderer include BorderPrinter def self.render(out, plot) new(plot).render(out) end def initialize(plot) @plot = plot @out = nil end attr_reader :plot attr_reader :out def render(out) @out = out init_render render_top render_rows render_bottom end private def render_top # plot the title and the top border print_title(@border_padding, plot.title, p_width: @border_length, color: :bold) puts if plot.title_given? if plot.show_labels? topleft_str = plot.decorations.fetch(:tl, "") topleft_col = plot.colors_deco.fetch(:tl, :light_black) topmid_str = plot.decorations.fetch(:t, "") topmid_col = plot.colors_deco.fetch(:t, :light_black) topright_str = plot.decorations.fetch(:tr, "") topright_col = plot.colors_deco.fetch(:tr, :light_black) if topleft_str != "" || topright_str != "" || topmid_str != "" topleft_len = topleft_str.length topmid_len = topmid_str.length topright_len = topright_str.length print_styled(out, @border_padding, topleft_str, color: topleft_col) cnt = (@border_length / 2.0 - topmid_len / 2.0 - topleft_len).round pad = cnt > 0 ? " " * cnt : "" print_styled(out, pad, topmid_str, color: topmid_col) cnt = @border_length - topright_len - topleft_len - topmid_len + 2 - cnt pad = cnt > 0 ? " " * cnt : "" print_styled(out, pad, topright_str, "\n", color: topright_col) end end print_border_top(out, @border_padding, @border_length, plot.border) print(" " * @max_len_r, @plot_padding, "\n") end # render all rows def render_rows (0 ... 
plot.n_rows).each {|row| render_row(row) } end def render_row(row) # Current labels to left and right of the row and their length left_str = plot.labels_left.fetch(row, "") left_col = plot.colors_left.fetch(row, :light_black) right_str = plot.labels_right.fetch(row, "") right_col = plot.colors_right.fetch(row, :light_black) left_len = nocolor_string(left_str).length right_len = nocolor_string(right_str).length unless color?(out) left_str = nocolor_string(left_str) right_str = nocolor_string(right_str) end # print left annotations print(" " * plot.margin) if plot.show_labels? if row == @y_lab_row # print ylabel print_styled(out, plot.ylabel, color: :normal) print(" " * (@max_len_l - plot.ylabel_length - left_len)) else # print padding to fill ylabel length print(" " * (@max_len_l - left_len)) end # print the left annotation print_styled(out, left_str, color: left_col) end # print left border print_styled(out, @plot_padding, @b[:l], color: :light_black) # print canvas row plot.print_row(out, row) #print right label and padding print_styled(out, @b[:r], color: :light_black) if plot.show_labels? print(@plot_padding) print_styled(out, right_str, color: right_col) print(" " * (@max_len_r - right_len)) end puts end def render_bottom # draw bottom border and bottom labels print_border_bottom(out, @border_padding, @border_length, plot.border) print(" " * @max_len_r, @plot_padding) if plot.show_labels? 
botleft_str = plot.decorations.fetch(:bl, "") botleft_col = plot.colors_deco.fetch(:bl, :light_black) botmid_str = plot.decorations.fetch(:b, "") botmid_col = plot.colors_deco.fetch(:b, :light_black) botright_str = plot.decorations.fetch(:br, "") botright_col = plot.colors_deco.fetch(:br, :light_black) if botleft_str != "" || botright_str != "" || botmid_str != "" puts botleft_len = botleft_str.length botmid_len = botmid_str.length botright_len = botright_str.length print_styled(out, @border_padding, botleft_str, color: botleft_col) cnt = (@border_length / 2.0 - botmid_len / 2.0 - botleft_len).round pad = cnt > 0 ? " " * cnt : "" print_styled(out, pad, botmid_str, color: botmid_col) cnt = @border_length - botright_len - botleft_len - botmid_len + 2 - cnt pad = cnt > 0 ? " " * cnt : "" print_styled(out, pad, botright_str, color: botright_col) end # abuse the print_title function to print the xlabel. maybe refactor this puts if plot.xlabel_given? print_title(@border_padding, plot.xlabel, p_width: @border_length) end end def init_render @b = BORDER_MAP[plot.border] @border_length = plot.n_columns # get length of largest strings to the left and right @max_len_l = plot.show_labels? && !plot.labels_left.empty? ? plot.labels_left.each_value.map {|l| nocolor_string(l).length }.max : 0 @max_len_r = plot.show_labels? && !plot.labels_right.empty? ? plot.labels_right.each_value.map {|l| nocolor_string(l).length }.max : 0 if plot.show_labels? && plot.ylabel_given? 
@max_len_l += plot.ylabel_length + 1 end # offset where the plot (incl border) begins @plot_offset = @max_len_l + plot.margin + plot.padding # padding-string from left to border @plot_padding = " " * plot.padding # padding-string between labels and border @border_padding = " " * @plot_offset # compute position of ylabel @y_lab_row = (plot.n_rows / 2.0).round - 1 end def print_title(padding, title, p_width: 0, color: :normal) return unless title && title != "" offset = (p_width / 2.0 - title.length / 2.0).round offset = [offset, 0].max tpad = " " * offset print_styled(out, padding, tpad, title, color: color) end def print(*args) out.print(*args) end def puts(*args) out.puts(*args) end def nocolor_string(str) str.to_s.gsub(/\e\[[0-9]+m/, "") end end class Plot include StyledPrinter DEFAULT_WIDTH = 40 DEFAULT_BORDER = :solid DEFAULT_MARGIN = 3 DEFAULT_PADDING = 1 def initialize(title: nil, xlabel: nil, ylabel: nil, border: DEFAULT_BORDER, margin: DEFAULT_MARGIN, padding: DEFAULT_PADDING, labels: true) @title = title @xlabel = xlabel @ylabel = ylabel @border = border @margin = check_margin(margin) @padding = padding @labels_left = {} @colors_left = {} @labels_right = {} @colors_right = {} @decorations = {} @colors_deco = {} @show_labels = labels @auto_color = 0 end attr_reader :title attr_reader :xlabel attr_reader :ylabel attr_reader :border attr_reader :margin attr_reader :padding attr_reader :labels_left attr_reader :colors_left attr_reader :labels_right attr_reader :colors_right attr_reader :decorations attr_reader :colors_deco def title_given? title && title != "" end def xlabel_given? xlabel && xlabel != "" end def ylabel_given? ylabel && ylabel != "" end def ylabel_length (ylabel && ylabel.length) || 0 end def show_labels? @show_labels end def annotate!(loc, value, color: :normal) case loc when :l (0 ... n_rows).each do |row| if @labels_left.fetch(row, "") == "" @labels_left[row] = value @colors_left[row] = color break end end when :r (0 ... 
n_rows).each do |row| if @labels_right.fetch(row, "") == "" @labels_right[row] = value @colors_right[row] = color break end end when :t, :b, :tl, :tr, :bl, :br @decorations[loc] = value @colors_deco[loc] = color else raise ArgumentError, "unknown location to annotate (#{loc.inspect} for :t, :b, :l, :r, :tl, :tr, :bl, or :br)" end end def annotate_row!(loc, row_index, value, color: :normal) case loc when :l @labels_left[row_index] = value @colors_left[row_index] = color when :r @labels_right[row_index] = value @colors_right[row_index] = color else raise ArgumentError, "unknown location `#{loc}`, try :l or :r instead" end end def render(out) Renderer.render(out, self) end COLOR_CYCLE = [ :green, :blue, :red, :magenta, :yellow, :cyan ].freeze def next_color COLOR_CYCLE[@auto_color] ensure @auto_color = (@auto_color + 1) % COLOR_CYCLE.length end def to_s StringIO.open do |sio| render(sio) sio.close sio.string end end private def check_margin(margin) if margin < 0 raise ArgumentError, "margin must be >= 0" end margin end private def check_row_index(row_index) unless 0 <= row_index && row_index < n_rows raise ArgumentError, "row_index out of bounds" end end end class Barplot < Plot include ValueTransformer MIN_WIDTH = 10 DEFAULT_COLOR = :green DEFAULT_SYMBOL = "■" def initialize(bars, width, color, symbol, transform, **kw) if symbol.length > 1 raise ArgumentError, "symbol must be a single character" end @bars = bars @symbol = symbol @max_freq, i = find_max(transform_values(transform, bars)) @max_len = bars[i].to_s.length @width = [width, max_len + 7, MIN_WIDTH].max @color = color @symbol = symbol @transform = transform super(**kw) end attr_reader :max_freq attr_reader :max_len attr_reader :width def n_rows @bars.length end def n_columns @width end def add_row!(bars) @bars.concat(bars) @max_freq, i = find_max(transform_values(@transform, bars)) @max_len = @bars[i].to_s.length end def print_row(out, row_index) check_row_index(row_index) bar = @bars[row_index] max_bar_width 
= [width - 2 - max_len, 1].max
        val = transform_values(@transform, bar)
        bar_len = max_freq > 0 ?
          ([val, 0].max.fdiv(max_freq) * max_bar_width).round : 0
        bar_str = max_freq > 0 ? @symbol * bar_len : ""
        bar_lbl = bar.to_s
        print_styled(out, bar_str, color: @color)
        print_styled(out, " ", bar_lbl, color: :normal)
        pad_len = [max_bar_width + 1 + max_len - bar_len - bar_lbl.length, 0].max
        pad = " " * pad_len.round
        out.print(pad)
      end

      private def find_max(values)
        i = j = 0
        max = values[i]
        while j < values.length
          if values[j] > max
            i, max = j, values[j]
          end
          j += 1
        end
        [max, i]
      end
    end
  end

  private_constant :MiniUnicodePlot
end
mini_histogram-0.3.1/lib/mini_histogram/version.rb0000644000004100000410000000005413761065275022360 0ustar www-datawww-data
class MiniHistogram
  VERSION = "0.3.1"
end
mini_histogram-0.3.1/lib/mini_histogram.rb0000644000004100000410000001317313761065275020701 0ustar www-datawww-data
require "mini_histogram/version"

# A class for building histogram info
#
# Given an array, this class calculates the "edges" of a histogram.
# These edges mark the boundaries for "bins".
#
#   array = [1,1,1, 5, 5, 5, 5, 10, 10, 10]
#   histogram = MiniHistogram.new(array)
#   puts histogram.edges
#   # => [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
#
# It also finds the weights (aka count of values) that would go in each bin:
#
#   puts histogram.weights
#   # => [3, 0, 4, 0, 0, 3]
#
# This means that the `array` here had three items between 0.0 and 2.0.
#
class MiniHistogram
  class Error < StandardError; end
  attr_reader :array, :left_p, :max

  def initialize(array, left_p: true, edges: nil)
    @array = array
    @left_p = left_p
    @edges = edges
    @weights = nil

    @min, @max = array.minmax
  end

  def edges_min
    edges.min
  end

  def edges_max
    edges.max
  end

  def histogram(*_)
    self
  end

  def closed
    @left_p ?
      :left : :right
  end

  # Sets the edge value to something new,
  # also clears any previously calculated values
  def update_values(edges:, max: )
    @edges = edges
    @max = max
    @weights = nil # clear memoized value
  end

  def bin_size
    return 0 if edges.length <= 1

    edges[1] - edges[0]
  end

  # Weird name, right? There are multiple ways to
  # calculate the number of "bins" a histogram should have; one
  # of the most common is the "sturges" method
  #
  # Here are some alternatives from numpy:
  # https://github.com/numpy/numpy/blob/d9b1e32cb8ef90d6b4a47853241db2a28146a57d/numpy/lib/histograms.py#L489-L521
  def sturges
    len = array.length
    return 1.0 if len == 0

    # return (long)(ceil(Math.log2(n)) + 1);
    return Math.log2(len).ceil + 1
  end

  # Given an array of edges and an array we want to generate a histogram from,
  # return the counts for each "bin"
  #
  # Example:
  #
  #   a = [1,1,1, 5, 5, 5, 5, 10, 10, 10]
  #   edges = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
  #
  #   MiniHistogram.new(a).weights
  #   # => [3, 0, 4, 0, 0, 3]
  #
  # This means that the `a` array has 3 values between 0.0 and 2.0,
  # 4 values between 4.0 and 6.0, and 3 values between 10.0 and 12.0
  def weights
    return @weights if @weights
    return @weights = [] if array.empty?

    lo = edges.first
    step = edges[1] - edges[0]

    max_index = ((@max - lo) / step).floor
    @weights = Array.new(max_index + 1, 0)

    array.each do |x|
      index = ((x - lo) / step).floor
      @weights[index] += 1
    end

    return @weights
  end

  # Finds the "edges" of a given histogram that will mark the boundaries
  # for the histogram's "bins"
  #
  # Example:
  #
  #   a = [1,1,1, 5, 5, 5, 5, 10, 10, 10]
  #   MiniHistogram.new(a).edges
  #   # => [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
  #
  # There are multiple ways to find edges; this one was taken from
  # https://github.com/mrkn/enumerable-statistics/issues/24
  #
  # Another good set of implementations is in numpy
  # https://github.com/numpy/numpy/blob/d9b1e32cb8ef90d6b4a47853241db2a28146a57d/numpy/lib/histograms.py#L222
  def edges
    return @edges if @edges
    return @edges = [0.0] if array.empty?
    lo = @min
    hi = @max

    nbins = sturges.to_f

    if hi == lo
      start = lo
      step = 1.0
      divisor = 1.0
      len = 1
    else
      bw = (hi - lo) / nbins
      lbw = Math.log10(bw)
      if lbw >= 0
        step = 10 ** lbw.floor * 1.0
        r = bw/step

        if r <= 1.1
          # do nothing
        elsif r <= 2.2
          step *= 2.0
        elsif r <= 5.5
          step *= 5.0
        else
          step *= 10
        end
        divisor = 1.0
        start = step * (lo/step).floor
        len = ((hi - start)/step).ceil
      else
        # Use a Float base: with an Integer divisor and the Integer `start`
        # computed below, `start/divisor` would truncate when r <= 1.1
        divisor = 10.0 ** -lbw.floor
        r = bw * divisor
        if r <= 1.1
          # do nothing
        elsif r <= 2.2
          divisor /= 2.0
        elsif r <= 5.5
          divisor /= 5.0
        else
          divisor /= 10.0
        end
        step = 1.0
        start = (lo * divisor).floor
        len = (hi * divisor - start).ceil
      end
    end

    if left_p
      while (lo < start/divisor)
        start -= step
      end

      while (start + (len - 1)*step)/divisor <= hi
        len += 1
      end
    else
      while lo <= start/divisor
        start -= step
      end

      while (start + (len - 1)*step)/divisor < hi
        len += 1
      end
    end

    @edges = []
    len.times.each do
      @edges << start/divisor
      start += step
    end

    return @edges
  end
  alias :edge :edges

  def plot
    raise "You must `require 'mini_histogram/plot'` to get this feature"
  end

  # Given an array of Histograms this function calculates
  # an average edge size along with the minimum and maximum
  # edge values.
  # It then updates the edge value on all inputs
  #
  # The main purpose of this method is to be able to chart multiple
  # distributions against a similar axis
  #
  # For more context, see: https://github.com/schneems/derailed_benchmarks/pull/169
  def self.set_average_edges!(*array_of_histograms)
    array_of_histograms.each { |x| raise "Input expected to be a histogram but is #{x.inspect}" unless x.is_a?(MiniHistogram) }
    steps = array_of_histograms.map(&:bin_size)
    avg_step_size = steps.inject(&:+).to_f / steps.length

    max_value = array_of_histograms.map(&:max).max

    max_edge = array_of_histograms.map(&:edges_max).max
    min_edge = array_of_histograms.map(&:edges_min).min

    average_edges = [min_edge]
    while average_edges.last < max_edge
      average_edges << average_edges.last + avg_step_size
    end

    array_of_histograms.each {|h| h.update_values(edges: average_edges, max: max_value) }

    return array_of_histograms
  end
end
mini_histogram-0.3.1/Gemfile0000644000004100000410000000022313761065275016060 0ustar www-datawww-data
source "https://rubygems.org"

# Specify your gem's dependencies in mini_histogram.gemspec
gemspec

gem "rake", "~> 12.0"
gem "minitest", "~> 5.0"
mini_histogram-0.3.1/.github/0000755000004100000410000000000013761065275016130 5ustar www-datawww-data
mini_histogram-0.3.1/.github/workflows/0000755000004100000410000000000013761065275020165 5ustar www-datawww-data
mini_histogram-0.3.1/.github/workflows/check_changelog.yml0000644000004100000410000000056013761065275023775 0ustar www-datawww-data
name: Check Changelog

on: [pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - name: Check that CHANGELOG is touched
      run: |
        cat $GITHUB_EVENT_PATH | jq .pull_request.title | grep -i '\[\(\(changelog skip\)\|\(ci skip\)\)\]' || git diff remotes/origin/${{ github.base_ref }} --name-only | grep CHANGELOG.md
mini_histogram-0.3.1/LICENSE.txt0000644000004100000410000000206313761065275016414 0ustar www-datawww-data
The MIT License (MIT)

Copyright (c) 2020 schneems

Permission is
hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.