Simplify Your Tech Stack with PostgreSQL

万能的大雄

Database | 2024-08-14 12:44:32

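A victory for abstraction

In pursuing innovation, developers have also made their technology stacks more complex. The good news is that radical simplicity is now a realistic option.

One effective strategy for simplifying the stack, speeding up development, reducing risk, and shipping more features is to use PostgreSQL for a wide range of backend functions.

PostgreSQL can now stand in for several technologies, including Kafka, RabbitMQ, MongoDB, and Redis, for applications serving up to millions of users.

This approach simplifies development and makes applications easier to write, scale, and operate. Fewer moving parts let developers focus on delivering customer value and ship features up to 50% more efficiently without added cost. It also lowers the cognitive and learning load on developers, helps them understand the system in depth, and guards against impostor syndrome.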

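Use Postgres instead of Redis for caching

Postgres with UNLOGGED tables and the JSON data type

Caching in PostgreSQL involves creating an UNLOGGED table and storing the cached data as JSON.

Here is a breakdown of the pieces:

UNLOGGED tables:

Performance: UNLOGGED tables are designed for speed. They do not write their data to the write-ahead log (WAL), so they are faster than regular tables but less durable. That makes them a good fit for caches whose data can be regenerated.
Durability: Because these tables skip the WAL, they are not crash-safe. An UNLOGGED table is truncated automatically after a crash or unclean shutdown, so its contents are lost. For transient cache data this trade-off is acceptable.

JSON data type:

Flexibility: Storing data as JSON allows a flexible schema design, which is useful for caching varied data structures without altering the table schema.
Query performance: PostgreSQL's JSONB type stores JSON in a decomposed binary format. It is slightly slower to write than plain JSON text but considerably faster to query, and it can be indexed.

Stored procedures:

Automation and maintenance: Stored procedures can manage cached data automatically. A procedure can handle data expiry so the cache stays relevant.
Integration: With stored procedures, caching logic lives in the database layer, which centralizes and simplifies the application architecture.

Expiring data:

TTL implementation: As with Redis, you can implement a time-to-live (TTL) mechanism in PostgreSQL. A stored procedure, run on a schedule, deletes stale cache entries.
Custom expiry logic: PostgreSQL lets you define your own expiry logic, so complex expiry policies can follow the needs of the application.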
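As a rough illustration of the pieces above, the following Python sketch keeps the SQL for an UNLOGGED JSONB cache with a TTL column in one place. The table name cache, its columns, and the helper functions are hypothetical. The %s placeholders assume a DB-API driver such as psycopg, and the caller supplies the cursor.

```python
import json

# Hypothetical schema for this sketch: one UNLOGGED table with a JSONB value
# and an explicit expiry timestamp that plays the role of a TTL.
CREATE_CACHE_TABLE = """
CREATE UNLOGGED TABLE IF NOT EXISTS cache (
    key        text PRIMARY KEY,
    value      jsonb NOT NULL,
    expires_at timestamptz NOT NULL
)
"""

UPSERT_ENTRY = """
INSERT INTO cache (key, value, expires_at)
VALUES (%s, %s::jsonb, now() + make_interval(secs => %s))
ON CONFLICT (key) DO UPDATE
SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at
"""

GET_ENTRY = "SELECT value FROM cache WHERE key = %s AND expires_at > now()"

PURGE_EXPIRED = "DELETE FROM cache WHERE expires_at <= now()"


def cache_set(cur, key, value, ttl_seconds):
    """Insert or overwrite an entry that expires ttl_seconds from now."""
    cur.execute(UPSERT_ENTRY, (key, json.dumps(value), ttl_seconds))


def cache_get(cur, key):
    """Return the cached value, or None if it is missing or expired."""
    cur.execute(GET_ENTRY, (key,))
    row = cur.fetchone()
    return None if row is None else row[0]


def purge_expired(cur):
    """Delete expired rows and return how many the driver reports."""
    cur.execute(PURGE_EXPIRED)
    return cur.rowcount
```

Reads already ignore expired rows, so purge_expired only reclaims space and can run from any scheduler at a coarse interval.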

Comparison with Redis

Here is how PostgreSQL and Redis compare for caching:


+------------------+---------------------------------------------+-------------------------------------------+
|     Feature      |     PostgreSQL (UNLOGGED Tables & JSON)     |                   Redis                   |
+------------------+---------------------------------------------+-------------------------------------------+
| Speed            | Fast (due to UNLOGGED tables)               | Extremely fast (in-memory)                |
| Durability       | Low (data loss on crash)                    | Low (data loss on crash unless AOF/RDB)   |
| Flexibility      | High (JSON storage)                         | High (supports various data types)        |
| Complex Queries  | Supports complex queries (SQL)              | Limited complex query support             |
| Setup Complexity | Higher (requires SQL and stored procedures) | Lower (simple configuration)              |
| Memory Usage     | Lower (disk-based storage)                  | Higher (in-memory storage)                |
| Scalability      | Good (scales with hardware)                 | Excellent (built for distributed caching) |
| Data Expiry      | Custom logic via stored procedures          | Built-in TTL support                      |
+------------------+---------------------------------------------+-------------------------------------------+

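What's in it for us?

Caching in PostgreSQL with UNLOGGED tables and the JSON data type lets a single database system handle both persistent and cached data, which simplifies your stack. The approach offers good performance and flexibility and suits applications that already run on PostgreSQL. For ultra-fast in-memory caching and built-in features such as TTL, however, Redis remains the better choice. Base the decision on your application's specific requirements, including performance needs, infrastructure complexity, and scalability considerations.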

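Use Postgres as a message queue with SKIP LOCKED

Using SKIP LOCKED in PostgreSQL

PostgreSQL's SKIP LOCKED clause (available since PostgreSQL 9.5) can be used to implement a message or job queue. Here is how it works and why it is effective:

SKIP LOCKED:

Function: The SKIP LOCKED clause lets a locking query skip rows that are currently locked by other transactions. This is useful for job queues in which several workers must process jobs concurrently without conflicts.
Concurrency: Multiple workers can query the same table for new jobs and safely skip those already claimed by another worker, which prevents contention and distributes jobs efficiently.

Operation:

Message queue: In this setup, messages or jobs are inserted into a PostgreSQL table. Workers select and process jobs and mark them as done when finished. SKIP LOCKED ensures that two workers never claim the same job at the same time.
River job queue in Go: The River library for Go manages job queues backed by PostgreSQL. It relies on SKIP LOCKED to fetch and process jobs efficiently.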
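The claim-and-process cycle described above could look roughly like the Python sketch below. The jobs table, its columns, and process_one_job are hypothetical names. The connection is assumed to be a DB-API connection (for example from psycopg) that is not in autocommit mode, so the row lock taken by the SELECT is held until commit or rollback.

```python
# Hypothetical queue table for this sketch.
CREATE_JOBS_TABLE = """
CREATE TABLE IF NOT EXISTS jobs (
    id         bigserial PRIMARY KEY,
    payload    jsonb NOT NULL,
    status     text NOT NULL DEFAULT 'pending',
    created_at timestamptz NOT NULL DEFAULT now()
)
"""

# Lock the oldest pending row that no other transaction has locked.
CLAIM_NEXT_JOB = """
SELECT id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED
"""

MARK_DONE = "UPDATE jobs SET status = 'done' WHERE id = %s"


def process_one_job(conn, handler):
    """Claim one pending job, run handler(payload), and mark it done.

    Returns the job id, or None when no unlocked pending job exists.
    If the handler raises, the transaction is rolled back and the job
    stays pending, so another worker can pick it up later.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_NEXT_JOB)
        row = cur.fetchone()
        if row is None:
            conn.rollback()
            return None
        job_id, payload = row
        handler(payload)
        cur.execute(MARK_DONE, (job_id,))
        conn.commit()
        return job_id
    except Exception:
        conn.rollback()
        raise
```

Because the handler runs inside the claiming transaction, this variant gives at-least-once processing and keeps a transaction open for the duration of each job. Long-running jobs may be better served by a separate "running" status.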

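Advantages:

Simplicity: Using PostgreSQL for message queuing reduces the need for extra infrastructure and simplifies deployment and maintenance.
Atomicity and consistency: PostgreSQL provides strong ACID guarantees, which makes job processing reliable.
Flexibility: You can use SQL for complex queries and job management, which allows sophisticated job-processing logic.

Comparison with Kafka and RabbitMQ

Here is how PostgreSQL with SKIP LOCKED compares with Kafka and RabbitMQ: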


+------------------+----------------------------------------+--------------------------------------------------+----------------------------------------------------+
|     Feature      |        PostgreSQL (SKIP LOCKED)        |                      Kafka                       |                      RabbitMQ                      |
+------------------+----------------------------------------+--------------------------------------------------+----------------------------------------------------+
| Durability       | High (ACID compliant)                  | High (persistent logs)                           | High (durable queues)                              |
| Scalability      | Moderate (scales with hardware)        | Excellent (designed for high throughput)         | Good (supports clustering)                         |
| Complexity       | Lower (single system for DB and queue) | Higher (requires separate setup)                 | Higher (requires separate setup)                   |
| Latency          | Moderate (disk-based operations)       | Low (optimized for high throughput)              | Low (optimized for low latency)                    |
| Throughput       | Moderate                               | High                                             | Moderate to High                                   |
| Management       | Simpler (single database)              | Complex (requires management of brokers, topics) | Complex (requires management of exchanges, queues) |
| Message Ordering | Supports ordering via SQL queries      | Supports ordering per partition                  | Supports FIFO queues                               |
| Use Cases        | Simple queues, job scheduling          | Event streaming, large-scale message processing  | Reliable messaging, complex routing                |
+------------------+----------------------------------------+--------------------------------------------------+----------------------------------------------------+


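What's in it for me?

Using PostgreSQL as a message queue with SKIP LOCKED is a viable option for applications that need simple, reliable job processing without the overhead of running a separate message broker. For applications that need high throughput, low latency, and complex routing, dedicated systems such as Kafka and RabbitMQ may be a better fit. The choice depends on the application's specific requirements, including performance, scalability, and infrastructure complexity.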

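Use Postgres with Timescale as a data warehouse

Using Postgres with TimescaleDB

TimescaleDB is a PostgreSQL extension optimized for time-series data, which makes it a powerful tool for data warehousing. In detail:

Time-series data handling:

Partitioning: TimescaleDB automatically partitions data into chunks by time interval. This improves query performance and simplifies data management.
Compression: Built-in compression reduces storage requirements and improves I/O performance.

Scalability:

Hypertables: TimescaleDB introduces hypertables, which behave like a single table but are partitioned internally for better performance and scalability.
Cluster support: Multi-node support allows horizontal scaling by distributing data and queries across nodes. Multi-node has been deprecated in recent TimescaleDB releases, so check the version you plan to deploy.

Integration with PostgreSQL:

SQL compatibility: TimescaleDB keeps full SQL support and works with standard PostgreSQL tools and extensions.
Ecosystem: It integrates with the PostgreSQL ecosystem and uses its tools and features to provide a complete data warehousing solution.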
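To make hypertables, chunking, and compression more concrete, here is a hedged Python sketch that runs a typical setup sequence through a DB-API cursor. The conditions table and its columns are hypothetical. create_hypertable, add_compression_policy, and time_bucket are TimescaleDB functions, but their exact signatures and the compression settings vary between TimescaleDB versions, so treat the statements as a starting point.

```python
# One-time setup statements, in execution order. The table and column
# names are illustrative only.
SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS timescaledb",
    """
    CREATE TABLE IF NOT EXISTS conditions (
        time        timestamptz NOT NULL,
        device_id   text NOT NULL,
        temperature double precision
    )
    """,
    # Turn the plain table into a hypertable partitioned by the time column.
    "SELECT create_hypertable('conditions', 'time', if_not_exists => TRUE)",
    # Enable compression, grouping compressed data by device.
    """
    ALTER TABLE conditions SET (
        timescaledb.compress,
        timescaledb.compress_segmentby = 'device_id'
    )
    """,
    # Compress chunks once they are older than seven days.
    "SELECT add_compression_policy('conditions', INTERVAL '7 days')",
]

# Average temperature per device in one-hour buckets over the last N hours.
HOURLY_AVERAGE = """
SELECT time_bucket('1 hour', time) AS bucket, device_id, avg(temperature)
FROM conditions
WHERE time > now() - make_interval(hours => %s)
GROUP BY bucket, device_id
ORDER BY bucket
"""


def setup_hypertable(cur):
    """Run the setup statements once, in order."""
    for statement in SETUP_STATEMENTS:
        cur.execute(statement)


def hourly_average(cur, hours):
    """Return (bucket, device_id, avg) rows for the last `hours` hours."""
    cur.execute(HOURLY_AVERAGE, (hours,))
    return cur.fetchall()
```

Queries against conditions stay ordinary SQL. The time predicate in HOURLY_AVERAGE lets TimescaleDB limit the scan to the chunks that cover the requested window.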

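Comparison with other analytical OLAP databases

The relevant comparison here is Postgres with TimescaleDB against ClickHouse and Greenplum.

Use Postgres with JSONB to store, search, and index JSON documents

The table below compares PostgreSQL with JSONB and MongoDB: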


+-----------------------+----------------------------------+--------------------------------------------------+
|        Feature        |        PostgreSQL (JSONB)        |                     MongoDB                      |
+-----------------------+----------------------------------+--------------------------------------------------+
| Data Format           | Binary JSON (JSONB)              | BSON (Binary JSON)                               |
| SQL Support           | Full SQL support                 | No SQL support (uses MongoDB Query Language)     |
| Indexing              | Supports GIN, B-tree, and others | Supports various index types                     |
| Query Performance     | High (optimized with indexing)   | High (optimized for document queries)            |
| Schema Flexibility    | High (schema-less, flexible)     | High (schema-less, flexible)                     |
| Transactions          | ACID-compliant transactions      | Multi-document ACID transactions                 |
| Scalability           | Good (scales with hardware)      | Excellent (built for horizontal scaling)         |
| Replication           | Supported (with built-in tools)  | Supported (with built-in tools)                  |
| Community and Support | Strong (PostgreSQL community)    | Strong (active community and enterprise support) |
| Data Aggregation      | Powerful SQL-based aggregation   | Powerful aggregation framework                   |
+-----------------------+----------------------------------+--------------------------------------------------+

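What's in it for us?

Using PostgreSQL with JSONB to store, search, and index JSON documents is a strong solution for applications that need flexible schema design together with strict ACID compliance.

PostgreSQL's SQL capabilities, combined with efficient JSONB indexing and querying, make it a versatile choice. MongoDB, however, offers specialized document-oriented features and excellent horizontal scaling, which makes it better suited to applications that need large-scale distributed data management.

The choice between PostgreSQL and MongoDB should rest on your specific requirements for query complexity, scalability, and transactional integrity.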

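Use Postgres as a cron daemon with pg_cron

Using pg_cron in PostgreSQL

pg_cron:

Scheduling tasks: pg_cron is a PostgreSQL extension that schedules SQL commands to run at specific times, much like cron jobs on Linux.
Integration: Tasks are defined directly inside the PostgreSQL environment and use SQL to perform operations such as data updates, maintenance tasks, or triggering external actions like sending email.

Message queue integration:

Adding events to a queue: pg_cron can schedule events that add entries to a message queue. This can be combined with PostgreSQL's notification system (LISTEN/NOTIFY) to trigger real-time actions.

Advantages:

Simplicity: Scheduling tasks centrally inside the database reduces the need for external cron management.
Transactional integrity: Scheduled tasks run as ordinary SQL with ACID guarantees, which keeps data consistent.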
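A scheduled enqueue of the kind described above might be registered as in this Python sketch. cron.schedule and cron.unschedule are pg_cron's SQL functions. The helper names, the job name, and the jobs(payload jsonb) table are hypothetical, and the %s placeholders assume a DB-API driver such as psycopg.

```python
# pg_cron exposes scheduling as SQL functions in the cron schema.
SCHEDULE_SQL = "SELECT cron.schedule(%s, %s, %s)"
UNSCHEDULE_SQL = "SELECT cron.unschedule(%s)"

# Example command: add an entry to a hypothetical jobs(payload jsonb) queue.
ENQUEUE_NIGHTLY_REPORT = (
    """INSERT INTO jobs (payload) VALUES ('{"task": "nightly_report"}')"""
)


def schedule_job(cur, name, cron_expr, command):
    """Register a named pg_cron job and return the job id it reports."""
    cur.execute(SCHEDULE_SQL, (name, cron_expr, command))
    return cur.fetchone()[0]


def unschedule_job(cur, name):
    """Remove a previously registered job by name."""
    cur.execute(UNSCHEDULE_SQL, (name,))
```

For example, schedule_job(cur, "enqueue-nightly-report", "0 3 * * *", ENQUEUE_NIGHTLY_REPORT) would enqueue one row every day at 03:00 in the time zone pg_cron is configured to use. Passing the command as a parameter avoids nesting quoted SQL inside SQL.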

Comparison with other batch processing systems

Here is how PostgreSQL with pg_cron compares with Spring Batch and Linux cron:

+-----------------------+------------------------------------------------+------------------------------------------+----------------------------------------+
|        Feature        |            PostgreSQL with pg_cron             |               Spring Batch               |               Linux cron               |
+-----------------------+------------------------------------------------+------------------------------------------+----------------------------------------+
| Scheduling            | Integrated with SQL-based scheduling           | Framework for batch processing           | Basic time-based job scheduling        |
| Ease of Use           | Easy (SQL-based, simple setup)                 | Moderate (requires Java setup)           | Easy (simple syntax, widely known)     |
| Scalability           | Good (scales with PostgreSQL)                  | Excellent (scales with Java apps)        | Moderate (limited by system resources) |
| Transaction Support   | Full ACID compliance                           | Supports transactions                    | No transactional support               |
| Complexity            | Low (simple SQL queries)                       | High (requires coding and configuration) | Low (simple script execution)          |
| Dependency Management | Integrated within database                     | External dependencies managed via Java   | No dependency management               |
| Error Handling        | Robust (database-level handling)               | Robust (framework-level handling)        | Basic (logs, exit codes)               |
| Job Types             | SQL queries, database tasks, external triggers | Complex batch processing, ETL            | Script execution, command-line tasks   |
| Notifications         | Built-in (LISTEN/NOTIFY, triggers)             | Custom implementations                   | Basic (email notifications)            |
+-----------------------+------------------------------------------------+------------------------------------------+----------------------------------------+

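What's in it for me?

Using PostgreSQL with pg_cron to schedule tasks and manage batch processing gives database-centric applications a simple, integrated approach. It preserves transactional integrity and uses PostgreSQL's own capabilities for task scheduling. For more complex batch processing and ETL work, however, frameworks such as Spring Batch offer more advanced features and scalability.

Linux cron remains a simple and effective solution for basic time-based job scheduling with less demanding requirements. The choice depends on the complexity of the tasks, the need for transactional integrity, and the existing infrastructure.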

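Use Postgres for geospatial queries

Using PostGIS with PostgreSQL

The PostGIS extension:

Geospatial data types: PostGIS adds support for geographic objects to PostgreSQL, so the database can store, query, and manipulate spatial data types.
Functions and indexes: It provides a comprehensive set of functions for spatial queries, along with spatial indexes (for example, R-tree indexes built on GiST) that improve performance.

Features:

Complex queries: Supports complex geospatial queries, including distance calculations, area calculations, and spatial relationships (for example, intersects and contains).
Standards compliance: Conforms to OGC (Open Geospatial Consortium) standards, which ensures compatibility with other geospatial systems.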
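A distance query of the kind mentioned above can be sketched in Python as follows. The places table, its columns, and find_nearby are hypothetical. ST_MakePoint, ST_SetSRID, ST_DWithin, and ST_Distance are PostGIS functions, and on the geography type they measure in meters. The %s placeholders assume a DB-API driver such as psycopg.

```python
# Hypothetical table storing points as geography in WGS 84 (SRID 4326).
CREATE_PLACES = """
CREATE TABLE IF NOT EXISTS places (
    id   bigserial PRIMARY KEY,
    name text NOT NULL,
    geog geography(Point, 4326) NOT NULL
)
"""

# A GiST index lets ST_DWithin avoid scanning every row.
CREATE_PLACES_INDEX = (
    "CREATE INDEX IF NOT EXISTS places_geog_idx ON places USING GIST (geog)"
)

# Places within a radius of a point, nearest first. ST_MakePoint takes
# longitude first, then latitude.
NEARBY_PLACES = """
SELECT name,
       ST_Distance(geog, ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography) AS meters
FROM places
WHERE ST_DWithin(geog, ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography, %s)
ORDER BY meters
"""


def find_nearby(cur, lon, lat, radius_m):
    """Return (name, meters) rows within radius_m meters of (lon, lat)."""
    cur.execute(NEARBY_PLACES, (lon, lat, lon, lat, radius_m))
    return cur.fetchall()
```

The point is built twice because it appears in both the filter and the distance column. A common table expression would avoid the repetition at the cost of a longer query.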

Comparison with other geospatial systems

Here is how PostgreSQL with PostGIS compares with Maptitude, ArcGIS, and Mapline:


+-----------------------+------------------------------------------+----------------------------------------+------------------------------------------+--------------------------------------+
|        Feature        |         PostgreSQL with PostGIS          |               Maptitude                |                  ArcGIS                  |               Mapline                |
+-----------------------+------------------------------------------+----------------------------------------+------------------------------------------+--------------------------------------+
| Data Types            | Supports complex geospatial data types   | Primarily vector and raster data       | Comprehensive (vector, raster, etc.)     | Basic geospatial data                |
| Query Language        | SQL with geospatial extensions           | GUI-based analysis, limited scripting  | Comprehensive scripting (Python, etc.)   | Limited query capabilities           |
| Scalability           | High (scales with PostgreSQL)            | Moderate (depends on system resources) | High (enterprise-level scalability)      | Moderate (cloud-based scalability)   |
| Ease of Use           | Moderate (requires SQL knowledge)        | Easy (user-friendly interface)         | Moderate to High (advanced features)     | Easy (user-friendly interface)       |
| Cost                  | Free (open-source)                       | Proprietary (requires purchase)        | Proprietary (requires purchase)          | Subscription-based                   |
| Integration           | Seamless with PostgreSQL and other tools | Standalone software                    | Extensive integration with ESRI products | Integrates with various data sources |
| Functionality         | Advanced spatial functions and analysis  | Basic to advanced spatial analysis     | Advanced spatial analysis and modeling   | Basic spatial analysis               |
| Community and Support | Strong open-source community             | Good customer support                  | Excellent customer support and community | Good customer support                |
+-----------------------+------------------------------------------+----------------------------------------+------------------------------------------+--------------------------------------+

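What's in it for me?

Geospatial querying with PostgreSQL and PostGIS is a powerful, scalable, and cost-effective solution for managing and analyzing spatial data. It is especially well suited to applications that need advanced spatial queries and integration with an existing PostgreSQL database.

For specialized geospatial analysis and user-friendly interfaces, however, commercial systems such as Maptitude, ArcGIS, and Mapline may offer more tailored features and better ease of use. The choice depends on your specific needs for functionality, cost, and ease of use.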

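Use Postgres for full-text search

Full-text search in PostgreSQL

tsvector and tsquery:

Data types: PostgreSQL uses the tsvector and tsquery data types to store and query full-text data.
Indexes: GIN (Generalized Inverted Index) and GiST (Generalized Search Tree) indexes can be used to speed up searches.

Full-text search features:

Search capabilities: PostgreSQL supports complex search queries, ranking, and weighting of search results.
Dictionary support: Stemming, stop-word, and synonym dictionaries are supported to improve search accuracy.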
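The data types, index, ranking, and weighting listed above fit together roughly as in this Python sketch. The articles table and search_articles are hypothetical, and the generated tsvector column assumes PostgreSQL 12 or later. The %s placeholders assume a DB-API driver such as psycopg.

```python
# Hypothetical table with a stored tsvector column. Title matches get
# weight A and body matches weight B, so titles rank higher.
CREATE_ARTICLES = """
CREATE TABLE IF NOT EXISTS articles (
    id    bigserial PRIMARY KEY,
    title text NOT NULL,
    body  text NOT NULL,
    tsv   tsvector GENERATED ALWAYS AS (
        setweight(to_tsvector('english', title), 'A') ||
        setweight(to_tsvector('english', body), 'B')
    ) STORED
)
"""

CREATE_ARTICLES_INDEX = (
    "CREATE INDEX IF NOT EXISTS articles_tsv_idx ON articles USING GIN (tsv)"
)

# plainto_tsquery turns free text into a tsquery using the English
# dictionary (stemming and stop words); ts_rank scores each match.
SEARCH_ARTICLES = """
SELECT id, title, ts_rank(tsv, query) AS rank
FROM articles, plainto_tsquery('english', %s) AS query
WHERE tsv @@ query
ORDER BY rank DESC
LIMIT %s
"""


def search_articles(cur, text, limit=10):
    """Return (id, title, rank) rows for the best matches."""
    cur.execute(SEARCH_ARTICLES, (text, limit))
    return cur.fetchall()
```

Storing the tsvector in a generated column keeps it in sync with title and body without triggers, and the GIN index serves the @@ match.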

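Comparison with other full-text search engines

The relevant comparison here is PostgreSQL full-text search against Elastic, Solr, Lucene, and Sphinx.

What's in it for me?

Using PostgreSQL for full-text search gives applications that already run on PostgreSQL a capable, integrated solution that draws on SQL and keeps data consistent. Dedicated search engines such as Elasticsearch, Solr, Lucene, and Sphinx, however, offer specialized features, more advanced search capabilities, and better scalability in large distributed environments. The choice depends on the complexity of the search requirements, the size of the data, and the need to integrate with existing systems.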

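Use Postgres to generate JSON for APIs

Generating JSON with PostgreSQL

JSON functions:

json_build_object: Builds a JSON object directly inside a SQL query.
json_agg: Aggregates the rows of a SQL query result into a JSON array.

Advantages:

Less middleware: Generating JSON directly in PostgreSQL removes the need for server-side code that formats data for the API.
Performance: Minimizing data transfer and processing between the database and application layers reduces latency.
Simpler architecture: Keeping data-transformation logic in the database simplifies development.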
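As an illustration of json_build_object and json_agg working together, the Python sketch below asks PostgreSQL for one API-ready document per author. The authors and posts tables, their columns, and author_json are hypothetical. The %s placeholder assumes a DB-API driver such as psycopg.

```python
# Build one JSON object per author with a nested array of posts.
# FILTER drops the all-NULL row a LEFT JOIN yields for authors without
# posts, and COALESCE turns the resulting NULL into an empty array.
AUTHOR_WITH_POSTS = """
SELECT json_build_object(
    'id', a.id,
    'name', a.name,
    'posts', COALESCE(
        json_agg(json_build_object('id', p.id, 'title', p.title) ORDER BY p.id)
            FILTER (WHERE p.id IS NOT NULL),
        '[]'::json
    )
)
FROM authors a
LEFT JOIN posts p ON p.author_id = a.id
WHERE a.id = %s
GROUP BY a.id, a.name
"""


def author_json(cur, author_id):
    """Return the JSON document for one author, or None if not found."""
    cur.execute(AUTHOR_WITH_POSTS, (author_id,))
    row = cur.fetchone()
    return None if row is None else row[0]
```

The application layer then only forwards the document. Whether the driver hands it back as a string or as a parsed object depends on the driver's JSON handling.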

Comparison with Firebase and other backend services

Here is a detailed comparison of PostgreSQL's JSON generation with Firebase and other backend services:

+------------------------+--------------------------------------+-----------------------------------------+----------------------------------------------------+
|        Feature         |     PostgreSQL (JSON Generation)     |                Firebase                 |               Other Backend Services               |
+------------------------+--------------------------------------+-----------------------------------------+----------------------------------------------------+
| Data Transformation    | In-database JSON generation          | JSON storage and retrieval              | Varies (often requires server-side code)           |
| Performance            | High (reduced data transfer)         | High (real-time updates)                | Varies                                             |
| Ease of Use            | Moderate (requires SQL knowledge)    | Easy (user-friendly interface)          | Varies (can require extensive setup)               |
| Scalability            | Good (scales with PostgreSQL)        | Excellent (built for scalability)       | Varies                                             |
| Integration            | Seamless within PostgreSQL ecosystem | Easy integration with Firebase services | Varies (integration complexity depends on service) |
| Cost                   | Free (open-source)                   | Freemium (free tier with paid options)  | Varies (can include free and paid tiers)           |
| Real-time Capabilities | Limited (requires triggers)          | Built-in real-time database             | Varies                                             |
| Security               | High (PostgreSQL security features)  | High (Firebase security rules)          | Varies (depends on implementation)                 |
+------------------------+--------------------------------------+-----------------------------------------+----------------------------------------------------+

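What's in it for me?

For applications that aim to reduce complexity and improve performance, generating JSON for APIs in PostgreSQL is a powerful approach.

It removes the need for extra server-side processing by producing API-ready JSON directly in the database. For real-time features, seamless integration, and a user-friendly interface, however, a backend service such as Firebase may be a better fit.

The choice depends on the application's specific needs, including performance requirements, real-time data handling, and ease of integration.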

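Use Postgres with pgaudit for auditing

Using pgaudit in PostgreSQL

The pgaudit extension:

Function: pgaudit (PostgreSQL Audit) produces detailed logs of database activity, including SELECT, INSERT, UPDATE, DELETE, and DDL commands.
Configuration: It gives fine-grained control over which events are logged, which helps monitor and audit database operations for compliance and security purposes.

Advantages:

Comprehensive logging: Captures a wide range of database events.
Compliance: Helps meet regulatory requirements by providing a detailed audit trail.
Integration: Works seamlessly within the PostgreSQL environment.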
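The fine-grained control mentioned above is exercised mainly through the pgaudit.log setting, which takes a list of statement classes. The Python sketch below only builds the SQL needed to enable a chosen set of classes. The helper name is hypothetical, and the sketch assumes pgaudit is already listed in shared_preload_libraries, a change that requires a server restart.

```python
# Statement classes accepted by pgaudit.log.
PGAUDIT_CLASSES = (
    "read", "write", "function", "role", "ddl", "misc", "misc_set", "all",
)


def pgaudit_setup_statements(classes):
    """Return the SQL statements that enable logging for the given classes.

    Raises ValueError for an empty list or an unknown class name.
    """
    normalized = [c.strip().lower() for c in classes]
    unknown = [c for c in normalized if c not in PGAUDIT_CLASSES]
    if unknown or not normalized:
        raise ValueError(f"unsupported pgaudit classes: {unknown}")
    setting = ", ".join(normalized)
    return [
        "CREATE EXTENSION IF NOT EXISTS pgaudit",
        # ALTER SYSTEM cannot run inside a transaction block, so execute
        # these statements on a connection in autocommit mode.
        f"ALTER SYSTEM SET pgaudit.log = '{setting}'",
        "SELECT pg_reload_conf()",
    ]
```

For example, pgaudit_setup_statements(["write", "ddl"]) yields statements that log data-modifying and schema-changing commands while leaving SELECT traffic out of the audit log, which keeps log volume manageable.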

Comparison with other change data capture solutions

Here is a detailed comparison of PostgreSQL with pgaudit against Hibernate Envers and Debezium:


+-----------------------+---------------------------------------+---------------------------------------+---------------------------------------------+
|        Feature        |        PostgreSQL with pgaudit        |           Hibernate Envers            |                  Debezium                   |
+-----------------------+---------------------------------------+---------------------------------------+---------------------------------------------+
| Purpose               | Auditing database activities          | Versioning of entity changes          | Change Data Capture (CDC)                   |
| Integration           | Integrated with PostgreSQL            | Integrated with Hibernate ORM         | Connects to databases via Kafka             |
| Granularity           | SQL command-level logging             | Entity-level versioning               | Row-level changes in database               |
| Setup Complexity      | Moderate (requires PostgreSQL config) | Moderate (requires Hibernate config)  | High (requires Kafka setup)                 |
| Performance Impact    | Moderate (logging overhead)           | Low to moderate (depends on usage)    | Low to moderate (depends on volume)         |
| Use Cases             | Security, compliance, auditing        | Auditing application-level changes    | Data integration, microservices             |
| Real-time Processing  | Limited (log-based)                   | No                                    | Yes (real-time CDC)                         |
| Historical Data       | Detailed logs of database activity    | Entity version history                | Captures data changes in real-time          |
| Cost                  | Free (open-source)                    | Free (open-source, part of Hibernate) | Free (open-source, part of Kafka ecosystem) |
| Community and Support | Strong (PostgreSQL community)         | Strong (Hibernate community)          | Strong (Debezium and Kafka communities)     |
+-----------------------+---------------------------------------+---------------------------------------+---------------------------------------------+

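What's in it for me?

Auditing with PostgreSQL and pgaudit is a robust way to capture detailed logs of database activity and helps meet compliance and security requirements. It is especially well suited to environments that need comprehensive database-level auditing.

Hibernate Envers is a good choice for tracking changes at the application level, while Debezium excels at capturing change data in real time for data integration and microservices. The choice depends on the specific auditing and change-tracking requirements, performance considerations, and integration needs.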

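Use Postgres with a GraphQL adapter

GraphQL adapters:

Function: A GraphQL adapter lets a PostgreSQL database serve GraphQL queries directly. It translates GraphQL operations into SQL and so provides a flexible API layer.
Advantages: This setup brings GraphQL's flexibility and expressive power to relational data and offers a uniform, efficient way to query it.

Integration:

Direct mapping: An adapter can map database tables and relationships directly to GraphQL types and resolvers, so queries execute dynamically without large amounts of server-side code.
Ease of use: It simplifies API development by reducing the need for middleware and custom resolver logic.

Comparison with other GraphQL adapters

Here is a detailed comparison of PostgreSQL with a GraphQL adapter against Prisma ORM and Apollo GraphQL: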

+------------------------+-----------------------------------------+-------------------------------------------+-------------------------------------------+
|        Feature         |     PostgreSQL with GraphQL Adapter     |                Prisma ORM                 |              Apollo GraphQL               |
+------------------------+-----------------------------------------+-------------------------------------------+-------------------------------------------+
| Database Integration   | Direct PostgreSQL integration           | Supports multiple databases               | Requires a separate data source           |
| Ease of Use            | High (direct mapping of SQL to GraphQL) | High (schema-based approach)              | Moderate (requires resolver functions)    |
| Performance            | High (optimized for PostgreSQL)         | High (efficient query generation)         | High (optimized for GraphQL queries)      |
| Flexibility            | Moderate (relies on PostgreSQL schema)  | High (customizable schema and resolvers)  | High (customizable resolvers)             |
| Schema Management      | Automatic based on database schema      | Managed via Prisma schema                 | Managed via GraphQL schema                |
| Real-time Capabilities | Limited (depends on implementation)     | Good (supports real-time updates)         | Excellent (supports subscriptions)        |
| Setup Complexity       | Low to moderate (simple setup)          | Moderate (requires Prisma setup)          | Moderate to high (requires Apollo setup)  |
| Community and Support  | Strong (PostgreSQL community)           | Growing (active community)                | Strong (active community and support)     |
| Cost                   | Free (open-source)                      | Free (open-source, with premium features) | Free (open-source, with premium features) |
+------------------------+-----------------------------------------+-------------------------------------------+-------------------------------------------+