skimmer

package module
v0.0.23
Published: Aug 17, 2025 License: AGPL-3.0 Imports: 21 Imported by: 0

README

Skimmer Project

The Skimmer Project is a set of tools for working with feeds. It currently drives the Antenna project.

Skimmer originated as a simple terminal based feed reader.

skimmer is a lightweight feed reader inspired by newsboat. skimmer is very minimal and deliberately lacks features; it has fewer features than newsboat. I think skimmer's best feature is what it doesn't do. skimmer tries to do two things well.

  1. Read a list of URLs, fetch the feeds and write the items to an SQLite 3 database
  2. Display the items from the SQLite 3 database in reverse chronological order

That's it. That is skimmer's secret power: it does only two things. There is no elaborate user interface beyond the standard input, standard output and standard error found on POSIX type operating systems.

If you invoke Skimmer's "interactive" mode your choices are still very limited.

  • press enter to go to the next item
  • press "n" to mark the item read
  • press "s" to mark the item saved
  • press "q" to quit interactive mode

By storing the item information in an SQLite3 database (like newsboat's cache.db file) I can re-purpose the feed content as needed. One example is my Antenna experiment, a personal news aggregation website. Another might be to convert the entries to BibTeX and manage them as references. Lots of options are possible. The key here is the SQLite3 database file.

Included in the Go based part of the project are a few additional tools that helped in creating Antenna. In the longer run I am thinking about changing horses to Deno compiled TypeScript to take advantage of the work that Dave Winer has done, see https://github.com/scripting. Increasingly I see skimmer evolving from a feed reading and link blogging tool into something more generalized: a post creation tool that publishes static Markdown content through RSS feeds, delivering Markdown for straight up blogging and micro blogging.

skimmer's url list

As mentioned, skimmer was very much inspired by newsboat. In fact it uses an enhanced version of newsboat's urls list format. That's because skimmer isn't trying to replace newsboat as a reader of all feeds, but instead gives me more options for what I can do with the feeds I've collected.

The newsboat urls file boils down to a list of urls, one per line, with an optional "label" added after the url. That "label" is expressed as a double quote, a tilde, the label content, and a closing double quote. One feed per line. That's really easy to parse. You can add comments using the hash mark; the hash mark and anything to its right is ignored when the urls are read into skimmer. skimmer adds a third item on a feed's line: after the label you can include the user agent string you want to use when interacting with the feed's host. That capability was added in October 2023.

UPDATE: 2023-10-31. In using the experimental skimmer app in practice I have found that some feed sources still white list access based on user agent strings. Unfortunately it is highly inconsistent which string is accepted. As a result, maintaining a list of feeds is really challenging unless you can specify a user agent string per feed source for those that need it. So I've added an additional column of content to the newsboat urls file format: a user agent can be included after a feed's label by adding a space and the user agent string value.
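Putting the format together, a skimmer urls file might look like this (the feed URLs, label text and user agent value here are made up for illustration):

```
# Comments start with a hash mark and run to the end of the line.
https://example.com/feed.xml "~Example Blog"
https://example.org/index.rss "~Example News" Mozilla/5.0 (compatible; skimmer)
```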

UPDATE: 2025-02-14. I've been relying on skimmer to browse my RSS feed collections for a couple of years now. By and large it works. Since I started skimmer I've noticed how the new crop of social media platforms have also included RSS support. You can follow people using RSS feeds on Mastodon, BlueSky, and Threads. While we're also experiencing an AI driven bot meltdown on the web, my hope is that RSS feed practices will continue.

Skimmer's SQLite 3 database

Skimmer uses SQLite 3 databases to hold collections of feeds and their items. It doesn't use newsboat's cache.db but is very much inspired by it. The name of a skimmer database ends in ".skim" and pairs with the name of the skimmer urls list file. For example, if I have a urls list named "my_news.txt" the skimmer program will use a database file (creating it if it doesn't exist) called "my_news.skim". Each time skimmer reads the urls file it will replace the content in the skimmer database file, except for any notations about a given item having been read or saved.
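Because a ".skim" file is just an SQLite 3 database, you can also poke at it directly with the sqlite3 command line shell. A sketch, using a made-up demo.skim file; the items table column names (link, title, status) are assumptions based on the SQL statements in the reference documentation:

```shell
# Create a tiny stand-in for a .skim database (real ones are created by skimmer).
sqlite3 demo.skim "CREATE TABLE items (link TEXT, title TEXT, status TEXT DEFAULT '');"
sqlite3 demo.skim "INSERT INTO items (link, title, status) VALUES ('https://example.com/post-1', 'First post', 'saved');"

# List the saved items.
sqlite3 demo.skim "SELECT link, title FROM items WHERE status = 'saved';"

# Count items per status ('' is treated as unread).
sqlite3 demo.skim "SELECT IIF(status = '', 'unread', status), COUNT(*) FROM items GROUP BY status;"
```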

skimmer feed types

Presently skimmer is focused on reading RSS 2, Atom and JSON feeds, as that is what is provided by the Go package skimmer uses (i.e. gofeed). Someday, maybe, I hope to include support for Gopher or Gemini feeds.

SYNOPSIS

skimmer [OPTIONS] URL_LIST_FILENAME
skimmer [OPTIONS] SKIMMER_DB_FILENAME [TIME_RANGE]

There are two ways to invoke skimmer. You can fetch the contents of a list of URLs in newsboat urls file format, or you can read the items from the related skimmer database.

OPTIONS

-help : display a help page

-license : display license

-version : display version number and build hash

-limit N : Limit the display to the N most recent items

-prune : Delete items from the items table of the skimmer file provided. If a time range is provided then the items in the time range will be deleted. If a single time is provided, everything older than that time is deleted. A time can be specified in several ways. The alias "today" removes all items older than today. If "now" is specified then all items older than the current time are removed. Otherwise a time can be specified as a date in YYYY-MM-DD format or as a timestamp in YYYY-MM-DD HH:MM:SS format.

-i, -interactive : display an item and prompt for the next action, e.g. (n)ext, (s)ave, (t)ag, (q)uit. If you press enter the next item will be displayed without changing the item's state (e.g. marking it read). If you press "n" the item will be marked as read before displaying the next item. If you press "s" the item will be tagged as saved and the next item will be displayed. If you press "t" you can tag the item; tagged items are treated as saved but the next item is not fetched. Pressing "q" will quit interactive mode without changing the last item's state.

Examples

Fetch and read my newsboat feeds from .newsboat/urls. This will create a .newsboat/urls.skim if it doesn't exist. Remember, invoking skimmer with a urls file will retrieve feeds and their contents, while invoking skimmer with the skimmer database file will let you read them.

skimmer .newsboat/urls
skimmer .newsboat/urls.skim

This will fetch and read the feeds from my-news.urls. This will create a my-news.skim file. When the skimmer database is read a simplistic interactive mode is presented.

skimmer my-news.urls
skimmer -i my-news.skim

The same method is used to update your my-news.skim file and read it.

Export the current state of the skimmer database channels to a urls file. Feeds that failed to be retrieved will not be in the database's channels table. This is an easy way to get rid of the cruft and dead feeds.

skimmer -urls my-news.skim >my-news.urls

Prune the items in the database older than today.

skimmer -prune my-news.skim today

Prune the items older than September 30, 2023.

skimmer -prune my-news.skim \
    "2023-09-30 23:59:59"

Installation instructions

Installation From Source

Requirements

skimmer is an experiment. The compiled binaries are not necessarily tested. To compile from source you need to have git, make, Pandoc, SQLite3 and Go.

  • Git >= 2
  • Make >= 3.8 (GNU Make)
  • Pandoc > 3
  • SQLite3 > 3.4
  • Go >= 1.21.4
Steps to compile and install

This is the installation process I used to set up skimmer on a new machine.

git clone https://github.com/rsdoiel/skimmer
cd skimmer
make
make install

Acknowledgments

This experiment would not be possible without the authors of newsboat, SQLite3, Pandoc and the gofeed package for Go.

Documentation

Index

Constants

const (
	EnvHttpBrowser   = "SKIM_HTTP_BROWSER"
	EnvGopherBrowser = "SKIM_GOPHER_BROWSER"
	EnvGemeniBrowser = "SKIM_GEMINI_BROWSER"
	EnvFtpBrowser    = "SKIM_FTP_BROWSER"
)
const (
	// Version number of release
	Version = "0.0.23"

	// ReleaseDate, the date version.go was generated
	ReleaseDate = "2025-08-17"

	// ReleaseHash, the Git hash when version.go was generated
	ReleaseHash = "28b79f3"
	LicenseText = `` /* 34525-byte string literal not displayed */

)

Variables

var (
	// SQLCreateTables provides the statements that are used to create our tables.
	// It has two %s placeholders: the first is the feed list name, the second
	// is the datetime the schema was generated.
	SQLCreateTables = `` /* 651-byte string literal not displayed */

	// SQLResetChannels clears the channels table
	SQLResetChannels = `DELETE FROM channels;`

	// Update the channels in the skimmer file
	SQLUpdateChannel = `` /* 221-byte string literal not displayed */

	// Update a feed item in the items table
	SQLUpdateItem = `` /* 302-byte string literal not displayed */

	// Return link and title for Urls formatted output
	SQLChannelsAsUrls = `SELECT link, title FROM channels ORDER BY link;`

	// SQLItemCount returns a list of items in the items table
	SQLItemCount = `SELECT COUNT(*) FROM items;`

	// SQLItemStats returns a list of rows with totals per status
	SQLItemStats = `SELECT IIF(status = '', 'unread', status) AS status, COUNT(*) FROM items GROUP BY status ORDER BY status`

	// SQLDisplayItems returns a list of items in descending chronological order.
	SQLDisplayItems = `` /* 183-byte string literal not displayed */

	SQLMarkItem = `UPDATE items SET status = ? WHERE link = ?;`

	SQLTagItem = `UPDATE items SET tags = ? WHERE link = ?;`

	// SQLPruneItems will prune our items table of all items that have either
	// an updated or publication date earlier than the timestamp provided.
	SQLPruneItems = `` /* 167-byte string literal not displayed */

)

Functions

func CheckWaitInterval

func CheckWaitInterval(iTime time.Time, wait time.Duration) (time.Time, bool)

CheckWaitInterval checks to see if an interval of time has been met or exceeded. It returns the remaining time interval (possibly reset) and a boolean. The boolean is true when the time interval has been met or exceeded, false otherwise.

```
tot := len(records) // calculate the total number of items to process
t0 := time.Now()
iTime := time.Now()
reportProgress := false

for i := range records {
    // ... process stuff ...
    if iTime, reportProgress = CheckWaitInterval(iTime, 30*time.Second); reportProgress {
        log.Printf("%s", ProgressETA(t0, i, tot))
    }
}
```

func ClearScreen added in v0.0.3

func ClearScreen()

func FmtHelp

func FmtHelp(src string, appName string, version string, releaseDate string, releaseHash string) string

FmtHelp lets you process a text block with simple curly brace markup.

func JSONMarshal added in v0.0.3

func JSONMarshal(data interface{}) ([]byte, error)

JSONMarshal provides a custom JSON encoder to solve an issue with HTML entities getting converted to UTF-8 code points by json.Marshal() and json.MarshalIndent().
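The usual standard library technique for this is to disable HTML escaping on a json.Encoder. A minimal sketch of that approach (my own jsonMarshal helper for illustration, not skimmer's actual source):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// jsonMarshal encodes data without escaping <, > and & to \u003c etc.,
// which json.Marshal does by default.
func jsonMarshal(data interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(data); err != nil {
		return nil, err
	}
	// Encode appends a trailing newline; trim it to match json.Marshal output.
	return bytes.TrimRight(buf.Bytes(), "\n"), nil
}

func main() {
	src, _ := json.Marshal("<b>bold</b>")
	fmt.Println(string(src)) // escaped form using \u003c and friends
	out, _ := jsonMarshal("<b>bold</b>")
	fmt.Println(string(out)) // "<b>bold</b>"
}
```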

func JSONMarshalIndent added in v0.0.3

func JSONMarshalIndent(data interface{}, prefix string, indent string) ([]byte, error)

JSONMarshalIndent provides a custom JSON encoder to solve an issue with HTML entities getting converted to UTF-8 code points by json.Marshal() and json.MarshalIndent().

func JSONUnmarshal added in v0.0.3

func JSONUnmarshal(src []byte, data interface{}) error

JSONUnmarshal is a custom JSON decoder so we can treat numbers more easily.

func OpenInBrowser added in v0.0.5

func OpenInBrowser(in io.Reader, out io.Writer, eout io.Writer, link string) error

func ParseURLList

func ParseURLList(fName string, src []byte) (map[string]*FeedSource, error)

ParseURLList takes a filename and byte slice source, parses the contents, and returns a map of urls to feed sources and an error value.
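As a sketch of what parsing this urls format involves (this is my own simplified reimplementation for illustration, not the package's actual code; the `parseURLList` name and its naive "#"-stripping rule are assumptions):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// FeedSource mirrors the shape of the package's FeedSource type.
type FeedSource struct {
	Url, Label, UserAgent string
}

// parseURLList handles one feed per line: URL, an optional "~Label" in double
// quotes, then an optional user agent string. A "#" starts a comment (a real
// parser would be more careful about "#" appearing inside a URL).
func parseURLList(src string) map[string]*FeedSource {
	feeds := map[string]*FeedSource{}
	scanner := bufio.NewScanner(strings.NewReader(src))
	for scanner.Scan() {
		line := scanner.Text()
		if i := strings.Index(line, "#"); i >= 0 {
			line = line[:i]
		}
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		fs := &FeedSource{}
		// The URL is the first space-delimited field.
		parts := strings.SplitN(line, " ", 2)
		fs.Url = parts[0]
		if len(parts) == 2 {
			rest := strings.TrimSpace(parts[1])
			if strings.HasPrefix(rest, `"~`) {
				if end := strings.Index(rest[2:], `"`); end >= 0 {
					fs.Label = rest[2 : 2+end]
					fs.UserAgent = strings.TrimSpace(rest[2+end+1:])
				}
			}
		}
		feeds[fs.Url] = fs
	}
	return feeds
}

func main() {
	src := "# my feeds\nhttps://example.com/feed.xml \"~Example\" Mozilla/5.0\nhttps://example.org/rss\n"
	fs := parseURLList(src)["https://example.com/feed.xml"]
	fmt.Println(fs.Label, fs.UserAgent)
}
```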

func ProgressETA

func ProgressETA(t0 time.Time, i int, tot int) string

ProgressETA returns a string with the percentage processed and estimated time remaining. It requires a time zero value, a counter of records processed, and the total count of records.

```
tot := len(records) // calculate the total number of items to process
t0 := time.Now()
iTime := time.Now()
reportProgress := false

for i := range records {
    // ... process stuff ...
    if iTime, reportProgress = CheckWaitInterval(iTime, 30*time.Second); reportProgress {
        log.Printf("%s", ProgressETA(t0, i, tot))
    }
}
```

func ProgressIPS

func ProgressIPS(t0 time.Time, i int, timeUnit time.Duration) string

ProgressIPS returns a string with the elapsed time and increments per second. It takes a time zero, a counter and a time unit, and returns a string with the count, running time and increments per time unit.

```
t0 := time.Now()
iTime := time.Now()
reportProgress := false

for i := range records {
    // ... process stuff ...
    if iTime, reportProgress = CheckWaitInterval(iTime, 30*time.Second); reportProgress || i == 0 {
        log.Printf("%s", ProgressIPS(t0, i, time.Second))
    }
}
```

func SaveChannel added in v0.0.8

func SaveChannel(db *sql.DB, link string, feedLabel string, channel *gofeed.Feed) error

SaveChannel will write the Channel information to a skimmer channel table.

func SaveItem added in v0.0.8

func SaveItem(db *sql.DB, feedLabel string, item *gofeed.Item) error

SaveItem saves a gofeed item to the item table in the skimmer database

func SetupScreen added in v0.0.3

func SetupScreen(out io.Writer)

Types

type FeedSource added in v0.0.8

type FeedSource struct {
	Url       string `json:"url,omitempty"`
	Label     string `json:"label,omitempty"`
	UserAgent string `json:"user_agent,omitempty"`
}

FeedSource describes the source of a feed. It includes the URL, an optional label, and an optional user agent string.

type Html2Skim added in v0.0.8

type Html2Skim struct {
	// AppName holds the name of the application
	AppName string `json:"app_name,omitempty"`

	// DbName holds the path to the SQLite3 database
	DBName string `json:"db_name,omitempty"`

	// URL holds the URL to visit to collect items from
	URL string `json:"url,omitempty"`

	// Selector holds the HTML selector used to retrieve links.
	// An empty selector will result in looking for all href attributes in the page document.
	Selector string `json:"selector,omitempty"`

	// Title holds the channel title for the pseudo feed created by scraping
	Title string `json:"title,omitempty"`

	// Description holds the channel description for the pseudo feed created by scraping
	Description string `json:"description,omitempty"`

	// Link set the feed link for channel, this is useful if you render a pseudo feed to RSS
	Link string `json:"link,omitempty"`

	// Generator lets you set the generator value for the channel
	Generator string `json:"generator,omitempty"`

	// LastBuildDate sets the date for the channel being built
	LastBuildDate string `json:"last_build_date,omitempty"`
	// contains filtered or unexported fields
}

Html2Skim uses the Colly Go package to scrape a website and turn it into an RSS feed.

The Html2Skim struct holds the configuration for scraping a webpage and updating a skimmer database, populating both the channel table and the items table based on how the struct is set.

func NewHtml2Skim added in v0.0.8

func NewHtml2Skim(appName string) (*Html2Skim, error)

NewHtml2Skim initializes a new Html2Skim struct

func (*Html2Skim) Run added in v0.0.8

func (app *Html2Skim) Run(out io.Writer, eout io.Writer, args []string, title string, description string, link string) error

func (*Html2Skim) Scrape added in v0.0.8

func (app *Html2Skim) Scrape(db *sql.DB, uri string, selector string) (*gofeed.Feed, error)

Scrape takes a Skimmer database, a URI (url) and CSS selector pointing at anchor elements you want to create a feed with. It then collects those links and renders a feed struct and error value.

type Skim2Html added in v0.0.22

type Skim2Html struct {
	// AppName holds the name of the application
	// used when generating the "generator" metadata
	AppName string `json:"app_name,omitempty" yaml:"app_name,omitempty"`

	// Version holds the version of the application
	// used when generating the "generator" metadata
	Version string `json:"version,omitempty" yaml:"version,omitempty"`

	// DbName holds the path to the SQLite3 database
	DBName string `json:"db_name,omitempty" yaml:"db_name,omitempty"`

	// Title if this is set the title will be included
	// when generating the markdown of saved items
	Title string `json:"title,omitempty" yaml:"title,omitempty"`

	// Description, included as metadata in head element
	Description string `json:"description,omitempty" yaml:"description,omitempty"`

	// CSS is the path to a CSS file
	CSS string `json:"css,omitempty" yaml:"css,omitempty"`

	// Modules is a list of ES6 module files
	Modules []string `json:"modules,omitempty" yaml:"modules,omitempty"`

	// Header holds the HTML markup of the header element. If not included
	// then it will be generated using the Title and a timestamp
	Header string `json:"header,omitempty" yaml:"header,omitempty"`

	// Nav holds the HTML markup for navigation
	Nav string `json:"nav,omitempty" yaml:"nav,omitempty"`

	// Footer holds the HTML markup for the footer
	Footer string `json:"footer,omitempty" yaml:"footer,omitempty"`
	// contains filtered or unexported fields
}

Skim2Html supports the skim2html cli.

func NewSkim2Html added in v0.0.22

func NewSkim2Html(appName string) (*Skim2Html, error)

NewSkim2Html initializes a new Skim2Html struct

func (*Skim2Html) DisplayItem added in v0.0.22

func (app *Skim2Html) DisplayItem(link string, title string, description string, enclosures string, updated string, published string, label string, tags string) error

func (*Skim2Html) LoadCfg added in v0.0.22

func (app *Skim2Html) LoadCfg(cfgName string) error

func (*Skim2Html) Run added in v0.0.22

func (app *Skim2Html) Run(out io.Writer, eout io.Writer, args []string) error

func (*Skim2Html) Write added in v0.0.22

func (app *Skim2Html) Write(db *sql.DB) error

Write displays the contents from the database.

type Skim2Md added in v0.0.5

type Skim2Md struct {
	// AppName holds the name of the application
	AppName string `json:"app_name,omitempty"`

	// DbName holds the path to the SQLite3 database
	DBName string `json:"db_name,omitempty"`

	// Title if this is set the title will be included
	// when generating the markdown of saved items
	Title string `json:"title,omitempty"`

	// FrontMatter, if true insert Frontmatter block in Markdown output
	FrontMatter bool `json:"frontmatter,omitempty"`

	// PocketButton, if true insert a "save to pocket" button for each RSS item output
	PocketButton bool
	// contains filtered or unexported fields
}

Skim2Md supports the skim2md cli.

func NewSkim2Md added in v0.0.5

func NewSkim2Md(appName string) (*Skim2Md, error)

NewSkim2Md initializes a new Skim2Md struct

func (*Skim2Md) DisplayItem added in v0.0.5

func (app *Skim2Md) DisplayItem(link string, title string, description string, enclosures string, updated string, published string, label string, tags string) error

func (*Skim2Md) Run added in v0.0.5

func (app *Skim2Md) Run(out io.Writer, eout io.Writer, args []string, frontMatter bool, pocketButton bool) error

func (*Skim2Md) Write added in v0.0.5

func (app *Skim2Md) Write(db *sql.DB) error

Write displays the contents from the database.

type Skimmer

type Skimmer struct {
	// AppName holds the name of the application
	AppName string `json:"app_name,omitempty" yaml:"app_name,omitempty"`

	// UserAgent holds the user agent string used by skimmer.
	// Right now I plan to default it to
	//       app.AppName + "/" + app.Version + " (" + ReleaseDate + "." + ReleaseHash + ")"
	UserAgent string `json:"user_agent,omitempty" yaml:"user_agent,omitempty"`

	// DbName holds the path to the SQLite3 database
	DBName string `json:"db_name,omitempty" yaml:"db_name,omitempty"`

	// Urls are the map of urls to labels to be fetched or read
	Urls map[string]*FeedSource `json:"urls,omitempty" yaml:"urls,omitempty"`

	// Limit constrains the number of items shown
	Limit int `json:"limit,omitempty" yaml:"limit,omitempty"`

	// Prune indicates whether to prune the database.
	Prune bool `json:"prune,omitempty" yaml:"prune,omitempty"`

	// Interactive if true causes Run to display one item at a time with a minimal of input
	Interactive bool `json:"interactive,omitempty" yaml:"interactive,omitempty"`

	// AsUrls, output the skimmer feeds as a newsboat style url file
	AsUrls bool `json:"as_urls,omitempty" yaml:"as_urls,omitempty"`
	// contains filtered or unexported fields
}

Skimmer is the application structure that holds configuration and ties the app to the runner for the cli.

func NewSkimmer

func NewSkimmer(appName string) (*Skimmer, error)

func (*Skimmer) ChannelsToUrls added in v0.0.3

func (app *Skimmer) ChannelsToUrls(db *sql.DB) ([]byte, error)

ChannelsToUrls converts the current channels table to Urls formatted output and refreshes the app.Urls data structure.

func (*Skimmer) DisplayItem added in v0.0.5

func (app *Skimmer) DisplayItem(link string, title string, description string, enclosures string, updated string, published string, label string, tags string) error

func (*Skimmer) Download

func (app *Skimmer) Download(db *sql.DB) error

Download the contents from app.Urls

func (*Skimmer) ItemCount

func (app *Skimmer) ItemCount(db *sql.DB) (int, error)

ItemCount returns the total number items in the database.

func (*Skimmer) LoadCfg added in v0.0.23

func (app *Skimmer) LoadCfg(userAgent string, limit int, prune bool, interactive bool, asUrls bool) error

LoadCfg takes the command line options, a configuration file and updates the app object.

func (*Skimmer) MarkItem added in v0.0.3

func (app *Skimmer) MarkItem(db *sql.DB, link string, val string) error

func (*Skimmer) PruneItems

func (app *Skimmer) PruneItems(db *sql.DB, pruneDT time.Time) error

PruneItems takes a timestamp and performs a row delete on the table for items that are older than the timestamp.

func (*Skimmer) ReadUrls

func (app *Skimmer) ReadUrls(fName string) error

ReadUrls reads the urls or OPML file provided and updates the feeds in the skimmer file.

Newsboat's urls file format is `<URL><SPACE>"~<LABEL>"`, one entry per line. A hash mark ("#") at the start of a line indicates a comment line.

OPML is documented at http://opml.org

func (*Skimmer) ResetChannels added in v0.0.3

func (app *Skimmer) ResetChannels(db *sql.DB) error

func (*Skimmer) Run

func (app *Skimmer) Run(in io.Reader, out io.Writer, eout io.Writer, args []string) error

Run provides the runner for skimmer. It allows for testing much of the cli functionality.

func (*Skimmer) RunInteractive added in v0.0.3

func (app *Skimmer) RunInteractive(db *sql.DB) error

RunInteractive provides a sliver of interactive UI, basically displaying an item then prompting for an action.

func (*Skimmer) Setup

func (app *Skimmer) Setup(fPath string) error

Setup checks to see if anything needs to be setup (or fixed) for skimmer to run.

func (*Skimmer) TagItem added in v0.0.3

func (app *Skimmer) TagItem(db *sql.DB, link string, tag string) error

func (*Skimmer) Write

func (app *Skimmer) Write(db *sql.DB) error

Write displays the contents from the database.

Directories

Path Synopsis
cmd
html2skim command
skim2html command
skim2md command
skimmer command
