Google Patents Public Data, provided by IFI CLAIMS Patent Services, is a worldwide bibliographic and US full-text dataset of patent publications.
USPTO Patent Examination Research Data (PatEx) contains detailed information on millions of publicly viewable patent applications filed with the USPTO. The data are sourced from the Public Patent Application Information Retrieval system (Public PAIR).
SureChEMBL Data is a database of compounds extracted from the full text, images and attachments of patent documents.
MultiversX is a highly scalable, secure and decentralized blockchain network created to enable radically new applications for users, businesses, society, and the new metaverse frontier. This dataset is one of many crypto datasets that are available within Google Cloud Public Datasets. As with other Google Cloud public datasets, you can query this dataset for free, with up to 1TB of free processing every month. Watch this short video to learn how to get started with the public datasets.
USPTO Patent Trial and Appeal Board (PTAB) API Data contains data from the PTAB E2E (end-to-end) system making public America Invents Act (AIA) Trials information and documents available.
MAREC Data is a static collection of over 19 million patent applications and granted patents in a unified file format normalized from EP, WO, US, and JP sources, spanning a range from 1976 to June 2008.
Google Patents Research Data contains the output of much of the data analysis work used in Google Patents (patents.google.com), including machine translations of titles and abstracts from Google Translate, embedding vectors, extracted top terms, similar documents, and forward references.
High-demand automotive curated datasets, making it easy to access and discover deep insights into vehicle safety, driver behavior and competitors. Datasets contain historical data sourced from authentic and trusted sources like The National Highway Traffic Safety Administration (NHTSA), the National Center for Statistics and Analysis (NCSA), and the Bureau of Economic Analysis (BEA).
USPTO OCE Patent Claims Research data contains detailed information on claims from U.S. patents granted between 1976 and 2014 and U.S. patent applications published between 2001 and 2014.
Aptos is a Layer 1 blockchain that prioritizes scalability, security, and fast transaction speeds. Aptos utilizes a unique smart contract programming language called Move. Move was originally designed by Meta (formerly Facebook) for their Diem blockchain project and focuses on resource safety and verification. Data freshness can range from minutes to hours depending on chain activity and transaction volumes. Questions? Please reach out to cloud-blockchain-analytics-help@google.com.
World Development Indicators Data is the primary World Bank collection of development indicators, compiled from officially-recognized international sources. It presents the most current and accurate global development data available, and includes national, regional and global estimates.
IFI CLAIMS Patent Data Enrichments includes standardized assignee/applicant names and integrated legal status information.
US International Trade Commission 337Info Unfair Import Investigations Information System contains data on investigations done under Section 337. Section 337 declares the infringement of certain statutory intellectual property rights and other forms of unfair competition in import trade to be unlawful practices. Most Section 337 investigations involve allegations of patent or registered trademark infringement.
The NOAA Next Generation Radar (NEXRAD) public dataset on Google Cloud Storage consists of archived Level II weather radar collected from a network of 160 high-resolution Doppler weather radars operated by the NOAA National Weather Service (NWS), the Federal Aviation Administration (FAA), and the U.S. Air Force (USAF) as far back as 1991. These 10 cm S-Band Doppler radars, located in the contiguous United States, Alaska, Hawaii, U.S. territories, and at military base sites, detect atmospheric precipitation and winds. The Level II dataset collected before 2008 contains reflectivity, mean radial velocity, and spectrum width radial volume scans at original resolution (1 degree x 1 km resolution). Starting in 2008, the data is available at super-resolution (0.5 degree x 0.25 km resolution) for some reflectivity tilts. Starting in 2011, radars were upgraded to dual polarization. At that time, differential reflectivity, correlation coefficient, and differential phase data were added. For a complete description of this dataset, see NOAA's NEXRAD documentation. This public dataset is hosted in Google Cloud Storage and available free to use. Use this quick start guide to quickly learn how to access public datasets on Google Cloud Storage.
This dataset surfaces data from the Ethereum blockchain and includes tables for blocks, transactions, logs, and more. Ethereum is a decentralized open-source blockchain system that features its own cryptocurrency, Ether. A blockchain is an ever-growing tree of blocks. Each block contains a number of transactions. For more information, see the Blockchain Analytics documentation.
This dataset surfaces data from the Polygon blockchain and includes tables for blocks, transactions, logs, and more. Polygon is a Layer 2 decentralized, blockchain-based operating system with smart contract functionality, proof-of-stake principles as its consensus algorithm and a cryptocurrency native to the system, known as MATIC. A blockchain is an ever-growing tree of blocks. Each block contains a number of transactions. For more information, see the Blockchain Analytics documentation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
Drawing upon recent advances in machine learning and natural language processing, we introduce new tools that automatically ingest, parse, disambiguate and build an updated database using United States patent data. The tools identify unique inventor, assignee, and location entities mentioned on each granted US patent from 1976 to 2016. We describe data flow, algorithms, user interfaces, descriptive statistics, a novelty measure based on the first appearance of a word in the patent corpus, and an automated co-inventor network mapping tool. Balsmeier, B., Assaf, M., Chesebro, T., Fierro, G., Johnson, K., Johnson, S., Li, G., W.S. Lueck, O’Reagan, D., Yeh, W., Zang, G., Fleming, L. “Machine learning and natural language processing applied to the patent corpus.” Forthcoming at Journal of Economics and Management Strategy.
ERA5 is the fifth generation of the European Centre for Medium-Range Weather Forecasts (ECMWF) Atmospheric Reanalysis, providing hourly estimates of a large number of atmospheric, land, and oceanic climate variables. This data spans from 1979 to the present, covering the Earth on a 30 km grid and resolves the atmosphere using 137 levels from the surface up to a height of 80 km. A reanalysis is the “most complete picture currently possible of past weather and climate.” Reanalyses are created from assimilation of a wide range of data sources via numerical weather prediction (NWP) models. Meteorologically valuable variables for land and atmosphere were ingested and converted from grib data to Zarr (with no other modifications) to surface a cloud-optimized version of ERA5. In addition, an open-sourced code base is provided to show the provenance of the data as well as demonstrate common research workflows. This dataset includes both raw (grib) and cloud-optimized (zarr) files. Use cases. ERA5 data can be used in many different applications, including: training ML models that predict the impact of weather on different phenomena; training and evaluating ML models that forecast the weather; computing climatologies, the average weather for a region over a given period of time; and visualizing and studying historical weather events, such as Hurricane Sandy. Thanks to the open data policy of the Copernicus Climate Change and Atmosphere Monitoring Services and ECMWF, this dataset is available free as part of the Google Cloud Public Dataset Program. Please see below for license information.
Bitcoin Cash is a cryptocurrency that allows more bytes to be included in each block relative to its common ancestor Bitcoin. This dataset contains the blockchain data in their entirety, pre-processed to be human-friendly and to support common use cases such as auditing, investigating, and researching the economic and financial properties of the system. This dataset is part of a larger effort to make cryptocurrency data available in BigQuery through the Google Cloud Public Datasets program. The program is hosting several cryptocurrency datasets, with plans to both expand offerings to include additional cryptocurrencies and reduce the latency of updates. You can find these datasets by searching "cryptocurrency" in GCP Marketplace. For analytics interoperability, we designed a unified schema that allows all Bitcoin-like datasets to share queries. Interested in learning more about how the data from these blockchains were brought into BigQuery? Looking for more ways to analyze the data? Check out the Google Cloud Big Data blog post and try the sample queries below to get started. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
Create data products around your core business entities (customers, loans, products, etc.) and reuse them across use cases: • Test Data Management • Data Masking • Data Pipelining • GenAI Data Fusion • Customer 360 • MDM • Synthetic Data Generation • Data Migration • Data Tokenization • Operational Intelligence Reusable, scalable, and reliable data products K2view data products package data with everything needed for your authorized consumers to use it – for whatever the use case. K2view data products will allow you to: • Hide the complexities of your underlying sources • Forge a common language between business and IT • Democratize data access • Protect your data for authorized use • Elevate user trust in your company's data
BigQuery Data Transfer Service for Google Ads allows you to automatically schedule and manage recurring load jobs for Google Ads reporting data.
Windsor.ai for BigQuery allows you to get all your marketing and sales data, including Facebook Ads, into BigQuery with a few clicks, and build insightful dashboards.
The platform enables businesses to blend disparate datasets such as sales, finance, marketing, and advertising, to create a single source of truth over business performance. Through automated connectivity to hundreds of data sources and destinations, unrivalled data transformation options, and powerful data governance features, Adverity is the easiest way to get your data how you want it, where you want it, and when you need it.
Pelican is an AI-powered enterprise-scale technology to compare, validate and reconcile datasets across two heterogeneous data stores at a Petabyte scale. This includes table/views & column comparison, cell-level validation, selective column mapping, even mapping tables and columns with different data types or names, data lineage visualization, and the ability to get all the columns with mismatches in a single iteration. The validation can be achieved without coding or data movement. Try and Buy is only available for two weeks. For the Enterprise plan, please contact sales.pelican@datametica.com.
Y42's Turnkey Data Orchestration Platform offers a unified space to build, monitor, and maintain a robust flow of data, essential for powering your business. With Y42, you can ingest, transform, test, and automate your data on a unified architecture, facilitating seamless workflows among all pipeline components. Its built-in monitoring capabilities and git changelogs ensure that bad data never enters production and make catching and fixing errors a breeze.
Google Genomics helps the life science community organize the world’s genomic information and make it accessible and useful. Through our extensions to Google Cloud Platform, you can apply the same technologies that power Google Search and Maps to securely store, process, explore, and share large, complex datasets. Big genomic data is here today, with petabytes rapidly growing toward exabytes. Query the complete genomic information of large research projects in seconds. Process as many genomes and experiments as you like in parallel. Google Genomics supports open industry standards, including those developed by the Global Alliance for Genomics and Health, so you can share your tools and data with your group, collaborators, or the broader community, if and when you choose. Google's infrastructure provides reliable information security that can meet or exceed the requirements of HIPAA and protected health information.
NEAR is a user-friendly and carbon-neutral blockchain, built from the ground up to be performant, secure, and infinitely scalable. In technical terms, NEAR is a layer-one, sharded, proof-of-stake blockchain built with usability in mind. In simple terms, NEAR is blockchain for everyone. NEAR Crypto Public Dataset on Google Cloud: This dataset for NEAR blockchain contains the source code for ingesting NEAR Protocol data stored as OLAP-formatted text on Google Cloud BigQuery. The source data is streamed and transformed into cleaned and enriched tables. Who is this for? Blockchain data indexing is for anyone who wants to make sense of blockchain data. This includes: Users: create queries to track NEAR assets, monitor transactions, or analyze onchain events at massive scale; Researchers: use indexed data for data science tasks including analyzing onchain activities, identifying trends, or feeding AI/ML pipelines for predictive analysis; Startups: use NEAR's indexed data for deep insights on user engagement, smart contract utilization, or insights across tokens and NFT adoption. Benefits: Near-instant insights: historical onchain data queried at scale; Cost-effective: eliminate the need to store and process bulk NEAR protocol data, and query as little or as much data as preferred; Easy to use: no prior experience with blockchain technology required, bring a general knowledge of SQL to unlock insights. Extract to powerful collaborative services such as Google Connected Sheets, Looker Data Studio, or integrate with third-party tools like Tableau.
Open Targets Genetics is a comprehensive data integration tool that highlights variant-centric statistical evidence to allow both prioritisation of candidate causal variants at trait-associated loci and identification of potential drug targets. By aggregating and integrating human GWAS data, including data from relevant biobanks, and functional genomics data, Open Targets Genetics makes robust connections between GWAS-associated loci, variants and likely causal genes. This enables systematic identification and prioritisation of likely causal variants and genes across all published trait-associated loci. To learn more about Open Targets Genetics, read our documentation, join the Open Targets Community, or email us at helpdesk@opentargets.org. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
Merkle’s Data Accelerator quickly integrates disparate data sources to one centralized location. Using tools such as Google's BigQuery, it enables you to explore, query, report, visualize, and analyze your data quickly. This SaaS approach allows you to start small by solving your current data management needs, and then expand as your needs evolve. DATA ACCELERATOR WILL HELP YOU - Match individuals across disparate data sources using Merkury, Merkle's cookie-less identity resolution platform - Comply with CCPA and GDPR regulations - Build better customer and prospect audiences based on activities, events and demographics - Accelerate into cloud-based data management, then customize to support your expanded needs over time - Generate analytics and reporting - Connect your data to other Marketing Clouds RESULTS OUR CLIENTS ARE SEEING: - 40% time saved on analytics by automated consolidation of data sources - 25% quicker data onboarding, as compared to standalone cloud solution - 32% decrease of duplicative customer/prospect records after identity processing Note: If interested in Data Accelerator, please contact Merkle (marketing@merkleinc.com) for additional details including setup process and detailed pricing.
The Global Forecast System (GFS) is a weather forecast model produced by the National Centers for Environmental Prediction (NCEP). The GFS dataset consists of selected model outputs (described below) as gridded forecast variables. The 384-hour forecasts, with 3-hour forecast interval, are made at 6-hour temporal resolution (i.e. updated four times daily). Use the 'creation_time' and 'forecast_time' properties to select data of interest. The GFS is a coupled model, composed of an atmosphere model, an ocean model, a land/soil model, and a sea ice model which work together to provide an accurate picture of weather conditions. See history of recent modifications to the global forecast/analysis system, the model performance statistical web page, and the documentation homepage for more information.
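The forecast schedule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the description, not an API of the dataset: the function names are hypothetical, the inclusion of a 0-hour step and the 00/06/12/18 UTC run times are assumptions, and the actual dataset may use different steps for parts of the horizon.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the description above.
FORECAST_HORIZON_HOURS = 384   # each run forecasts 384 hours ahead
FORECAST_STEP_HOURS = 3        # 3-hour forecast interval
RUNS_PER_DAY = 4               # a new run every 6 hours


def forecast_times(creation_time):
    """Hypothetical helper: list the forecast_time values for one run.

    Assumes a uniform 3-hour step that includes hour 0 (the run time itself).
    """
    return [
        creation_time + timedelta(hours=h)
        for h in range(0, FORECAST_HORIZON_HOURS + 1, FORECAST_STEP_HOURS)
    ]


def daily_creation_times(day):
    """Hypothetical helper: list the creation_time values for one UTC day.

    Assumes the four runs start at midnight and are evenly spaced.
    """
    step = 24 // RUNS_PER_DAY
    return [day + timedelta(hours=h) for h in range(0, 24, step)]


if __name__ == "__main__":
    day = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for run in daily_creation_times(day):
        times = forecast_times(run)
        print(run.isoformat(), "->", len(times), "forecast steps, last at", times[-1].isoformat())
```

Under these assumptions, a pair of (creation_time, forecast_time) values identifies one forecast step, which is why the description suggests filtering on both properties.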
DataSunrise Database Security is a cross-platform high-performance software that secures databases and data in real-time. DataSunrise defends databases (both SQL and NoSQL) and protects companies' sensitive data from outside threats and internal security breaches. DataSunrise Database Security includes an intelligent Database Firewall, Data Masking (both Dynamic and Static), Sensitive Data Discovery and Data Audit (or Database Activity Monitoring - DAM). DataSunrise Database Firewall (Database Security module) monitors, in real-time, all queries from client applications or users to the database that is protected by DataSunrise. All popular databases and data warehouses are supported, including Google Cloud SQL, Oracle, MS SQL Server, MySQL, MariaDB, Greenplum, PostgreSQL, DB2, Teradata, Netezza, Vertica, MongoDB, SAP HANA and others. DataSunrise Dynamic Data Masking protects data from any unwanted requests by masking and obfuscating sensitive data in real-time. DataSunrise Database Security prevents SQL injection attacks by intelligently detecting SQL injections and immediately blocking malicious queries. DataSunrise Security empowers Database Audit and Protection. Consider DataSunrise for compliance with database-specific regulations, such as SOX, PCI, GDPR, HIPAA, and ISO/IEC 27001. DataSunrise helps to meet these requirements for monitoring, tracking and reporting database activity to meet auditors' demands. Security operations teams face an enormous challenge in ensuring optimum security of a database. The ultimate desire of any hostile attacker is gaining access to an organization's databases that store strategic business data. DataSunrise is the right security solution for organizations. Usage Instructions: Use your web browser to access the DataSunrise GUI at https://vm_public_ip:11000. Use the login 'admin' and the password generated during VM deployment.
The D&B ESG Intelligence database with industry-leading coverage of public and private company data from the Dun & Bradstreet Data Cloud and analytics built in alignment with leading ESG frameworks and standards, helps companies leverage actionable ESG intelligence to manage risk and increase supply chain resiliency. Procurement leaders, investment leaders and insurance and banking professionals can utilize D&B ESG Intelligence database for: Managing the supply chain using ESG data to help evaluate and onboard suppliers. Leveraging ESG data in the investment process to screen companies for risks and report fund performance to stakeholders. Informing business strategy with ESG data to increase financial performance and enhance company reputation. Utilizing ESG data in actuarial models and credit models to evaluate risks and opportunities. Key benefits that make D&B ESG Intelligence stand out are: Deep Coverage: business data coupled with ESG-related information on 80+ million public and private companies in 176 countries and territories, and growing Standardized Rankings: ESG Rankings are based off leading sustainability frameworks, such as SASB, GRI, The UN SDGs, TCFD, UN PRI, and CDP. Strategic Metrics: insights on companies, including a consistent outperformance for companies with good ESG Rankings Frequent Updates: data is updated weekly and distributed monthly to provide the most up-to-date snapshot on companies Tested and proven, Dun & Bradstreet’s ESG scores and ratings are easy to add to existing workflows to make an immediate impact on the quality of your risk management, supply chain resiliency and business performance decisions.
As the world’s premier data-first security solutions provider, Protegrity has set the standard in precision data protection. A typical enterprise deals with an eye-watering and growing amount of sensitive data, spread across regions, providers, and applications. Protegrity protects sensitive data at rest, in motion, and most importantly, enables data to be safely shared to power true business transformation. Customers can then focus on growth, development, and optimization, secure in the knowledge that their data remains secure and compliant across their enterprise. Precision data protection is the next frontier in data security, complements zero trust methodologies and allows the power of data collaboration to be harnessed. Headquartered in Salt Lake City, Utah, USA, with regional offices around the world, Protegrity is the enterprise data protection partner of choice. For more information or to structure a custom offer, please reach out to your GCP Account Manager or Protegrity directly at gcpmarketplace@protegrity.com.
This data includes all San Francisco 311 service requests from July 2008 to the present, and is updated daily. 311 is a non-emergency number that provides access to non-emergency municipal services. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
The Synthea Generated Synthetic Data in FHIR hosts over 1 million synthetic patient records generated using Synthea in FHIR format. The records were exported from the Google Cloud Healthcare API FHIR Store into BigQuery using the analytics schema. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. This public dataset is also available in Google Cloud Storage, free to use. The URL for the GCS bucket is gs://gcp-public-data--synthea-fhir-data-1m-patients. Use this quick start guide to quickly learn how to access public datasets on Google Cloud Storage. Please cite Synthea™ as: Jason Walonoski, Mark Kramer, Joseph Nichols, Andre Quina, Chris Moesel, Dylan Hall, Carlton Duffett, Kudakwashe Dube, Thomas Gallagher, Scott McLachlan, Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record, Journal of the American Medical Informatics Association, Volume 25, Issue 3, March 2018, Pages 230–238, https://doi.org/10.1093/jamia/ocx079
Rill makes it effortless to transform your datasets with SQL and create powerful, opinionated dashboards your users will actually use. What makes it different: -- feels good to use: powered by SvelteKit & DuckDB = conversation-fast, not wait-ten-seconds-for-result-set fast -- works with your local and remote datasets: imports and exports Parquet and CSV (s3, gcs, https, local) -- faster insights, easier to use: thoughtful, opinionated defaults drive more adoption and get data in the hands of users -- integrated into workflow: with dashboards as code, each step from data to dashboard has versioning, git sharing, and easy rehydration To speak with someone about Rill Enterprise for large scale data using Apache Druid, please contact google-cloud@rilldata.com.
Cloud Functions for Firebase allows you to extend and connect Firebase and cloud services with code. You can run your mobile backend code that automatically responds to events triggered by Firebase features and HTTPS requests without the need to manage and scale your own servers. Cloud Functions is a hosted, private, and scalable Node.js environment where you can run JavaScript code. Firebase SDK for Cloud Functions integrates the Firebase platform by letting you write code that responds to events and invokes functionality exposed by other Firebase features.
Cloud Spanner is a fully managed, mission-critical, relational database service that offers transactional consistency at global scale, schemas, SQL (ANSI 2011 with extensions), and automatic, synchronous replication for high availability. It is the first and only relational database service that is both strongly consistent and horizontally scalable. Cloud Spanner is ideal for relational, structured, and semi-structured data that requires high availability, strong consistency, and transactional reads and writes. It offers all the traditional benefits of a relational database – such as ACID transactions and SQL semantics – but unlike any other relational database service, Cloud Spanner scales horizontally, to hundreds or thousands of servers, so it can handle the highest of transactional workloads. With automatic scaling, synchronous data replication, and node redundancy, Cloud Spanner delivers up to 99.999% (five 9s) of availability for your mission-critical applications. In fact, Google’s internal Spanner service has been handling millions of queries per second from many Google services for years.
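To make the "five 9s" figure concrete, the short Python sketch below converts an availability percentage into the downtime it allows over a year. The function name is hypothetical and the 365-day year is an assumption; the calculation is plain arithmetic and says nothing about how availability is actually measured for the service.

```python
def downtime_minutes_per_year(availability_percent, days_per_year=365.0):
    """Hypothetical helper: minutes of downtime per year allowed by an
    availability percentage, assuming a 365-day year by default."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * days_per_year * 24 * 60


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% availability allows about {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, 99.999% availability corresponds to roughly 5.26 minutes of downtime per year.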
TapClicks data management platform provides agencies, media companies and brands with omnichannel reporting and analytics. Connect over 250 data sources with our instant on connectors plus an unlimited number of data sources with our smart connector. Compare online and offline data, create automated reporting and determine the best campaign optimizations all in a single dashboard. Every company is experiencing digital transformation. All of that data creates both an opportunity and a challenge for agencies. With great data comes great responsibility. With TapClicks, you are able to continue to scale your business while providing all of the reporting and analysis that your customers demand. Find out how we can transform your business today. TapClicks offers custom packages for all Google Marketplace customers. Please contact gcp@tapclicks.com to create a custom package specifically for your company.
Gauss Data Quality is a 4 in 1 technology that empowers businesses to harness the power of data, optimize operations, and achieve their strategic goals. It is an essential investment for any organization seeking to thrive in today's data-driven world. Gauss Data Quality streamlines marketing campaigns, maximizing their impact and enhancing ROI. By safeguarding data accuracy and integrity, it minimizes the risk of costly errors and promotes seamless collaboration between IT and marketing teams. 4 in 1: Event comparer: Compare event data across multiple platforms and identify discrepancies Data Consistency: Detects and warns you of outliers in your data Anomaly Insights: Bring business insights through detecting anomalies in your time series data Marketing Data Layer Validation: Total control over marketing data layer in one single dashboard