INOVAYT
Alternative time series data

You Found Us

  • Are you looking for fresh data to fuel your investment analysis?
  • Do you have software or programmers that can import valuable data and analyze it?
  • If some of the keywords below jump out at you, you are probably in the right place:
      alternative data, time series, stock market, hedge funds, investment, buy side, sell side, data signals, growth trends

Who We Are

INOVAYT, Inc. creates unique datasets that are of high quality, designed for time series analysis, and ideally suited for investments and trading.  Using a combination of machine learning algorithms and human workflow, INOVAYT is able to tackle difficult problems like identifying company aliases.  This recipe can be applied to many different kinds of datasets. "Big data" can then be boiled down to its core value and nicely packaged for assimilation into other systems, such as yours.

Datasets

  • US Trademarks:  This is extremely high quality United States trademark data — cleaned, normalized, mapped to companies, and designed for time series analysis. You can incorporate this data into your existing model or as a standalone model with proven performance.
  • Wikipedia Activity (almost ready for production):  Wikipedia page activity as it relates to publicly traded companies
  • US Patents (almost ready for evaluation):  United States patent applications as time series data for publicly traded companies
  • US CRE (in research stage):  United States commercial real estate purchases/leases as time series data for publicly traded companies
  • The Planet Dataset (in research stage):  Uniquely valuable data gathered by over 100 satellites
Consider this... "Today's alternative data is tomorrow's fundamental data."
Trademarks are actually a great example of this. Even though a given company's trademarks are not on the books, they are undeniably an indication of a company's innovation, marketing, productization, and what is worth protecting. A company's trademarks are much like a fingerprint or a signature. It is understood that there is value in this data; it's just that it has been too difficult to boil down into intelligible data, until now. INOVAYT has it.
INOVAYT has taken the headaches out of dealing with trademark data. The US government does provide most of the raw data. But, as you might expect, it is riddled with errors, inconsistencies, missing values, rule changes, name mismatches, etc. Just to get started, you would have to deal with over 200 GB of unwieldy data (and that doesn't even include the images). The company name alias/mismatch issue is a notoriously difficult problem. A given company can have literally dozens of aliases associated with its trademarks. The aliases can also be difficult to mine; for example, the trademarks for "Toys R Us" were mostly filed under "Geoffrey" (yes, their giraffe mascot). In its raw form, a trademark is actually a log (with a lot of "unnecessary" data). In our trademarks dataset, everything has been boiled down to just the right calculated fields.
Here are some articles that discuss the value of trademarks.

Sample Data

If you need a large sample, such as for backtesting, please contact us (see below).
Amazon - smooth upward trend
Tesla - relatively few trademarks
Avon Products - fell out of S&P 500

Coverage

Dataset:  US Trademarks
Companies:  Over 6,000 tickers are available.
History:  As far back as January 1, 2003
Data Frequency:  Daily (trademarks are filed every day)
Update Frequency:  Weekly (usually on Sunday)
Reporting Lag:  7 days (knowledge of a trademark filing is typically up to 7 days after the actual filing-date)
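
The reporting lag matters when backtesting: a filing dated D may not have been knowable until about D + 7 days. The following is a minimal sketch of one conservative way to handle that, assuming the "up to 7 days" lag is treated as a fixed delay. The names `REPORTING_LAG_DAYS`, `earliest_known_date`, and `visible_filings` are illustrative and not part of the dataset.

```python
from datetime import date, timedelta

# Assumption: treat the documented "up to 7 days" lag as a fixed, conservative delay.
REPORTING_LAG_DAYS = 7


def earliest_known_date(filing_date: str) -> date:
    """Return the first date on which a filing can safely be assumed known.

    filing_date uses the dataset's YYYY-MM-DD format.
    """
    return date.fromisoformat(filing_date) + timedelta(days=REPORTING_LAG_DAYS)


def visible_filings(rows, as_of: date):
    """Keep only rows whose filing would have been known by as_of."""
    return [r for r in rows if earliest_known_date(r["filing-date"]) <= as_of]
```

Filtering each backtest step through something like `visible_filings` is one way to avoid look-ahead bias; a less conservative model of the lag may also be reasonable.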

Fields

Field Name:  Description
ticker:  This is the ticker directly associated with the top-level company/CIK at the time the report is generated, according to the SEC.
filing-date:  The date that the trademarks were filed. The format is W3C: YYYY-MM-DD
trademark.count:  Number of trademarks filed (this is the primary indicator)
primary-class.count:  Total number of primary classes
secondary-class.count:  Total number of secondary classes
code-is-text.count:  Number of trademarks that contain only text
code-is-drawing-and-text.count:  Number of trademarks that contain a drawing/design and text
code-is-text-stylized.count:  Number of trademarks that contain a drawing/design of stylized text
code-is-drawing.count:  Number of trademarks that contain only a drawing/design
code-is-other.count:  Number of trademarks where no drawing is applicable, such as a sound or smell
code-is-unknown.count:  Number of trademarks where the code is unknown
text-character.count:  For all of the trademarks that include text, this is the total length of all of that text
intent-to-use.count:  Number of trademarks that were filed as "Intent to Use"
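
To give a feel for how these fields might be consumed, here is a small sketch that rolls the daily `trademark.count` values up into monthly totals per ticker. The delivery format is not specified on this page, so a CSV whose header row uses the field names above is an assumption, and the ticker `ABC` and its values are made-up sample data. The function name `monthly_trademark_counts` is also illustrative.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: both the CSV layout and the values are invented
# for illustration; only the field names come from the list above.
SAMPLE = """ticker,filing-date,trademark.count,intent-to-use.count
ABC,2020-01-06,2,1
ABC,2020-01-20,1,0
ABC,2020-02-03,4,4
"""


def monthly_trademark_counts(csv_text: str) -> dict:
    """Sum trademark.count per (ticker, YYYY-MM) bucket."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        month = row["filing-date"][:7]  # YYYY-MM prefix of the W3C date
        totals[(row["ticker"], month)] += int(row["trademark.count"])
    return dict(totals)


if __name__ == "__main__":
    for key, total in sorted(monthly_trademark_counts(SAMPLE).items()):
        print(key, total)
```

Monthly buckets are only one choice; because the data is daily, weekly or rolling windows would work the same way.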

Backtesting

US Trademarks data for the S&P 100 companies was used for backtesting, covering the last 20 years. Several fairly basic algorithms/methodologies were devised. One of them (the "trademarks-algorithm") was chosen for reporting results; it had the best overall performance, but not by very much (it was not an outlier). The algorithm is not a result of any machine learning and there is no risk of overfitting.
Using the "trademarks-algorithm":
  • The CAGR (compound annual growth rate) was approximately 1.8 times that of the S&P 500 index.
  • The risk-adjusted growth rate, using the Sortino Ratio as the measure, was approximately 1.4 times that of the S&P 500 index.
  • The CAGR was higher than that of the S&P 500 index in 15 of the 20 years.
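
For readers who want to compute the same two measures on their own backtests, here is a minimal sketch of the standard textbook definitions of CAGR and the Sortino Ratio. This is not the "trademarks-algorithm" itself, and the page does not state the target return or return period used, so the default `target` of 0 and per-period returns are assumptions.

```python
import math


def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def sortino_ratio(returns, target: float = 0.0) -> float:
    """Mean excess return divided by downside deviation.

    Downside deviation counts only returns below the target; periods at or
    above the target contribute zero to it. The target of 0 is an assumption.
    """
    excess = [r - target for r in returns]
    downside_sq = [min(0.0, e) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    if downside_dev == 0.0:
        return math.inf  # no periods fell below the target
    return (sum(excess) / len(excess)) / downside_dev
```

A comparison like "1.8 times the index" would then be the ratio of the strategy's `cagr` to the index's `cagr` over the same period.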

Access

  • The US Trademarks dataset is available on several commercial distribution platforms. If you use one, email us with your desired platform and we will let you know how to proceed.
  • Direct access: SFTP (we provide you with authentication credentials)

Contact Us

If you're ready for more information, or you are interested in analyzing companies that we don't have listed, please email us.