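Overview
- Written in Rust: fast (in the author's experience faster than comparable Go services) and lightweight, whereas Java-based engines tend to be heavy and memory-hungry.
- Meilisearch supports most of the common query types that Elasticsearch offers, which is enough for typical use.
- It is best suited to single-node deployment, although that does not rule out distributed setups; a few million documents are handled comfortably.
- Pagination works better than in Elasticsearch, and the relevance of its result ranking is roughly on par with Elasticsearch's scoring.
- It will likely need a few more years of maturing in the market before developers broadly accept it.
Installation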
docker pull getmeili/meilisearch:v1.8
docker run -d --name meilisearch -p 7700:7700 -e MEILI_MASTER_KEY='meilisearch-api' getmeili/meilisearch
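MEILI_MASTER_KEY is a key you define yourself and works like a password. You can leave it unset in the default development environment, but production mode requires one.
Note that the run command above uses the untagged image name, which Docker resolves to latest rather than the v1.8 image pulled in the first step; append :v1.8 to the image name to pin the version.
Open http://127.0.0.1:7700/ in a browser
and enter the key above.
This is a built-in console that can be used for debugging.
The official site provides sample data to experiment with.
Download the sample data: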
curl https://www.meilisearch.com/movies.json -O
Create the index and import the data:
curl -X POST "http://localhost:7700/indexes/movies/documents?primaryKey=id" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer meilisearch-api" \
--data-binary @movies.json
The results are now visible at http://127.0.0.1:7700/
and the data can of course also be queried from a program.
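As a rough illustration of querying from a program, the sketch below sends a basic search to the movies index using only the Python standard library. It assumes the local instance and the master key from the commands above; the helper names build_search_request and search are made up for this example.

```python
import json
import urllib.request

# Assumptions: the local instance and master key from the docker command above.
MEILI_URL = "http://localhost:7700"
API_KEY = "meilisearch-api"


def build_search_request(index_uid, query, limit=5):
    """Build a POST /indexes/{index_uid}/search request without sending it."""
    body = json.dumps({"q": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{MEILI_URL}/indexes/{index_uid}/search",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )


def search(index_uid, query, limit=5):
    """Send the search and return the list of matching documents ("hits")."""
    request = build_search_request(index_uid, query, limit)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["hits"]
```

With the container running and the sample data imported, search("movies", "botman") should return a list of movie documents; the misspelling is deliberate, to exercise typo tolerance.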
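Configuration
Meilisearch's configuration options can be listed with the -h
flag.
The official documentation on configuration options is at
https://www.meilisearch.com/docs/learn/self_hosted/configure_meilisearch_at_launch#command-line-options-and-flags
Because the official documentation is very detailed, the full list of options is reproduced here for quick reference.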
/meili_data # /bin/meilisearch -h
Usage: meilisearch [OPTIONS]
Options:
--config-file-path <CONFIG_FILE_PATH>
Set the path to a configuration file that should be used to setup the engine. Format must be TOML
--db-path <DB_PATH>
Designates the location where database files will be created and retrieved [env: MEILI_DB_PATH=] [default: ./data.ms]
--dump-dir <DUMP_DIR>
Sets the directory where Meilisearch will create dump files [env: MEILI_DUMP_DIR=] [default: dumps/]
--env <ENV>
Configures the instance's environment. Value must be either `production` or `development` [env: MEILI_ENV=] [default: development] [possible values: development, production]
--experimental-contains-filter
Experimental contains filter feature. For more information, see: <https://github.com/orgs/meilisearch/discussions/763> [env: MEILI_EXPERIMENTAL_CONTAINS_FILTER=]
--experimental-drop-search-after <EXPERIMENTAL_DROP_SEARCH_AFTER>
Experimental drop search after. For more information, see: <https://github.com/orgs/meilisearch/discussions/783> [env: MEILI_EXPERIMENTAL_DROP_SEARCH_AFTER=] [default: 60]
--experimental-dumpless-upgrade
Experimental dumpless upgrade. For more information, see: <https://github.com/orgs/meilisearch/discussions/804> [env: MEILI_EXPERIMENTAL_DUMPLESS_UPGRADE=]
--experimental-embedding-cache-entries <EXPERIMENTAL_EMBEDDING_CACHE_ENTRIES>
Enables experimental caching of search query embeddings. The value represents the maximal number of entries in the cache of each distinct embedder [env: MEILI_EXPERIMENTAL_EMBEDDING_CACHE_ENTRIES=] [default: 0]
--experimental-enable-logs-route
Experimental logs route feature. For more information, see: <https://github.com/orgs/meilisearch/discussions/721> [env: MEILI_EXPERIMENTAL_ENABLE_LOGS_ROUTE=]
--experimental-enable-metrics
Experimental metrics feature. For more information, see: <https://github.com/meilisearch/meilisearch/discussions/3518> [env: MEILI_EXPERIMENTAL_ENABLE_METRICS=]
--experimental-limit-batched-tasks-total-size <EXPERIMENTAL_LIMIT_BATCHED_TASKS_TOTAL_SIZE>
Experimentally reduces the maximum total size, in bytes, of tasks that will be processed at once, see: <https://github.com/orgs/meilisearch/discussions/801> [env: MEILI_EXPERIMENTAL_LIMIT_BATCHED_TASKS_SIZE=] [default: 18446744073709551615]
--experimental-logs-mode <EXPERIMENTAL_LOGS_MODE>
Experimental logs mode feature. For more information, see: <https://github.com/orgs/meilisearch/discussions/723> [env: MEILI_EXPERIMENTAL_LOGS_MODE=] [default: HUMAN]
--experimental-max-number-of-batched-tasks <EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS>
Experimentally reduces the maximum number of tasks that will be processed at once, see: <https://github.com/orgs/meilisearch/discussions/713> [env: MEILI_EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS=] [default: 18446744073709551615]
--experimental-nb-searches-per-core <EXPERIMENTAL_NB_SEARCHES_PER_CORE>
Experimental number of searches per core. For more information, see: <https://github.com/orgs/meilisearch/discussions/784> [env: MEILI_EXPERIMENTAL_NB_SEARCHES_PER_CORE=] [default: 4]
--experimental-no-snapshot-compaction
Experimental no snapshot compaction feature [env: MEILI_EXPERIMENTAL_NO_SNAPSHOT_COMPACTION=]
--experimental-reduce-indexing-memory-usage
Experimental RAM reduction during indexing, do not use in production, see: <https://github.com/meilisearch/product/discussions/652> [env: MEILI_EXPERIMENTAL_REDUCE_INDEXING_MEMORY_USAGE=]
--experimental-replication-parameters
Enable multiple features that helps you to run meilisearch in a replicated context. For more information, see: <https://github.com/orgs/meilisearch/discussions/725> [env: MEILI_EXPERIMENTAL_REPLICATION_PARAMETERS=]
--experimental-search-queue-size <EXPERIMENTAL_SEARCH_QUEUE_SIZE>
Experimental search queue size. For more information, see: <https://github.com/orgs/meilisearch/discussions/729> [env: MEILI_EXPERIMENTAL_SEARCH_QUEUE_SIZE=] [default: 1000]
-h, --help
Print help (see more with '--help')
--http-addr <HTTP_ADDR>
Sets the HTTP address and port Meilisearch will use [env: MEILI_HTTP_ADDR=0.0.0.0:7700] [default: localhost:7700]
--http-payload-size-limit <HTTP_PAYLOAD_SIZE_LIMIT>
Sets the maximum size of accepted payloads. Value must be given in bytes or explicitly stating a base unit (for instance: 107374182400, '107.7Gb', or '107374 Mb') [env: MEILI_HTTP_PAYLOAD_SIZE_LIMIT=] [default: 100000000]
--ignore-dump-if-db-exists
Prevents a Meilisearch instance with an existing database from throwing an error when using `--import-dump`. Instead, the dump will be ignored and Meilisearch will launch using the existing database [env: MEILI_IGNORE_DUMP_IF_DB_EXISTS=]
--ignore-missing-dump
Prevents Meilisearch from throwing an error when `--import-dump` does not point to a valid dump file. Instead, Meilisearch will start normally without importing any dump [env: MEILI_IGNORE_MISSING_DUMP=]
--ignore-missing-snapshot
Prevents a Meilisearch instance from throwing an error when `--import-snapshot` does not point to a valid snapshot file [env: MEILI_IGNORE_MISSING_SNAPSHOT=]
--ignore-snapshot-if-db-exists
Prevents a Meilisearch instance with an existing database from throwing an error when using `--import-snapshot`. Instead, the snapshot will be ignored and Meilisearch will launch using the existing database [env: MEILI_IGNORE_SNAPSHOT_IF_DB_EXISTS=]
--import-dump <IMPORT_DUMP>
Imports the dump file located at the specified path. Path must point to a `.dump` file. If a database already exists, Meilisearch will throw an error and abort launch [env: MEILI_IMPORT_DUMP=]
--import-snapshot <IMPORT_SNAPSHOT>
Launches Meilisearch after importing a previously-generated snapshot at the given filepath [env: MEILI_IMPORT_SNAPSHOT=]
--log-level <LOG_LEVEL>
Defines how much detail should be present in Meilisearch's logs [env: MEILI_LOG_LEVEL=] [default: INFO]
--master-key <MASTER_KEY>
Sets the instance's master key, automatically protecting all routes except `GET /health` [env: MEILI_MASTER_KEY=meilisearch-api]
--max-indexing-memory <MAX_INDEXING_MEMORY>
Sets the maximum amount of RAM Meilisearch can use when indexing. By default, Meilisearch uses no more than two thirds of available memory [env: MEILI_MAX_INDEXING_MEMORY=] [default: "10.373163858428597 GiB"]
--max-indexing-threads <MAX_INDEXING_THREADS>
Sets the maximum number of threads Meilisearch can use during indexation. By default, the indexer avoids using more than half of a machine's total processing units. This ensures Meilisearch is always ready to perform searches, even while you are updating an index [env: MEILI_MAX_INDEXING_THREADS=] [default: 6]
--no-analytics
Deactivates Meilisearch's built-in telemetry when provided [env: MEILI_NO_ANALYTICS=]
--schedule-snapshot [<SNAPSHOT_INTERVAL_SEC>]
Activates scheduled snapshots when provided. Snapshots are disabled by default [env: MEILI_SCHEDULE_SNAPSHOT=] [default: ]
--snapshot-dir <SNAPSHOT_DIR>
Sets the directory where Meilisearch will store snapshots [env: MEILI_SNAPSHOT_DIR=] [default: snapshots/]
--ssl-auth-path <SSL_AUTH_PATH>
Enables client authentication in the specified path [env: MEILI_SSL_AUTH_PATH=]
--ssl-cert-path <SSL_CERT_PATH>
Sets the server's SSL certificates [env: MEILI_SSL_CERT_PATH=]
--ssl-key-path <SSL_KEY_PATH>
Sets the server's SSL key files [env: MEILI_SSL_KEY_PATH=]
--ssl-ocsp-path <SSL_OCSP_PATH>
Sets the server's OCSP file. *Optional* [env: MEILI_SSL_OCSP_PATH=]
--ssl-require-auth
Makes SSL authentication mandatory [env: MEILI_SSL_REQUIRE_AUTH=]
--ssl-resumption
Activates SSL session resumption [env: MEILI_SSL_RESUMPTION=]
--ssl-tickets
Activates SSL tickets [env: MEILI_SSL_TICKETS=]
--task-webhook-authorization-header <TASK_WEBHOOK_AUTHORIZATION_HEADER>
The Authorization header to send on the webhook URL whenever a task finishes so a third party can be notified [env: MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER=]
--task-webhook-url <TASK_WEBHOOK_URL>
Called whenever a task finishes so a third party can be notified [env: MEILI_TASK_WEBHOOK_URL=]
-V, --version
Print version
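The English descriptions are simple enough that the meaning of each option is easy to work out.
One point deserves special mention. Taking the data location option --db-path
as an example,
the launch command at deployment time is:
/bin/meilisearch --db-path=/home/data/data.ms
Meilisearch also lets you supply these options through a configuration file. Note that only TOML files are supported, and that option names in the file use underscores (db_path rather than db-path). For example,
create a config.toml
file: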
db_path = "/home/data/data.ms"
Start the service:
/bin/meilisearch --config-file-path=./config.toml
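The effect is the same as with the command-line flag above.
The official repository provides a complete sample configuration file for download: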
curl https://raw.githubusercontent.com/meilisearch/meilisearch/latest/config.toml > config.toml
The most commonly used options are db-path, import-dump, config-file-path, and master-key.
If you need more operational control, the configuration file is worth a closer look.
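Features
Think of Meilisearch as a document database: an index corresponds to a table, and a document corresponds to a record in it.
The query features are not covered exhaustively here, since they are well documented online; only a few commonly used ones are described.
Index creation
Creation is either explicit or implicit. Adding documents directly to an index that does not exist yet creates the index from the data and inserts the records.
An index must have a primary key attribute, so that every document has a unique id; the field can be specified when the index is created.
If you do not specify one, Meilisearch infers a field from your dataset to use as the unique identifier.
The primary key field can also be updated later.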
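The explicit path can be sketched as two requests: one that creates an index with a chosen primary key, and one that changes the primary key afterwards. The snippet only builds the requests; the base URL, the key, the index name books, the field names, and the helper names are assumptions for illustration. To my understanding, Meilisearch only accepts a primary key change while the index holds no documents, so treat the second call as something to do before importing data.

```python
import json
import urllib.request

MEILI_URL = "http://localhost:7700"  # assumed local instance
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": "Bearer meilisearch-api",  # master key from the install step
}


def meili_request(method, path, payload):
    """Hypothetical helper: build (but do not send) a JSON request."""
    return urllib.request.Request(
        MEILI_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers=HEADERS,
    )


def create_index(index_uid, primary_key):
    """Explicit creation: POST /indexes with the index uid and primary key."""
    return meili_request("POST", "/indexes", {"uid": index_uid, "primaryKey": primary_key})


def change_primary_key(index_uid, primary_key):
    """Later update: PATCH /indexes/{index_uid} with the new primary key."""
    return meili_request("PATCH", f"/indexes/{index_uid}", {"primaryKey": primary_key})
```

Passing either request to urllib.request.urlopen sends it, for example urllib.request.urlopen(create_index("books", "book_id")). Both operations are queued as tasks rather than applied immediately.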
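Asynchronous tasks: adding documents, updating them, and creating indexes are all processed asynchronously, which suits computation-heavy workloads.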
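Because these writes are only queued, a client that needs to know when its data is searchable has to poll the task. The sketch below is one way to do that, assuming the task uid returned by the write request (the taskUid field of the response) and the local instance and key used earlier. The names fetch_task and wait_for_task are invented here, and the fetch parameter exists only so the loop can be exercised without a server.

```python
import json
import time
import urllib.request

MEILI_URL = "http://localhost:7700"  # assumed local instance
API_KEY = "meilisearch-api"          # master key from the install step


def fetch_task(task_uid):
    """GET /tasks/{task_uid} and return the decoded task object."""
    request = urllib.request.Request(
        f"{MEILI_URL}/tasks/{task_uid}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_task(task_uid, fetch=fetch_task, interval=0.5, timeout=30.0):
    """Poll until the task leaves the enqueued/processing states, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_uid)
        if task["status"] not in ("enqueued", "processing"):
            return task  # finished: succeeded, failed, or canceled
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_uid} is still {task['status']}")
        time.sleep(interval)
```

A caller would typically check that the returned task's status is succeeded and inspect its error field otherwise.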
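Data export, import, and migration are supported.
Embedding-backed search is supported: you can add vector search simply by configuring an LLM provider's API key. The author recommends OpenAI's embedding models, which cost on the order of 3 US cents per million tokens.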
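A minimal sketch of what that configuration might look like: one request that registers an OpenAI embedder in the index settings, and a search payload that mixes keyword and semantic results. The embedder name default, the model text-embedding-3-small, the document template, and the helper names are all assumptions chosen for illustration. Older releases treated vector search as an experimental feature that had to be enabled first, so check the documentation for the version you run.

```python
import json
import urllib.request

MEILI_URL = "http://localhost:7700"  # assumed local instance
API_KEY = "meilisearch-api"          # master key from the install step


def embedder_settings_request(index_uid, openai_key):
    """Build PATCH /indexes/{index_uid}/settings registering an OpenAI embedder."""
    payload = {
        "embedders": {
            "default": {  # arbitrary embedder name, reused at search time
                "source": "openAi",
                "apiKey": openai_key,
                "model": "text-embedding-3-small",  # assumed model choice
                "documentTemplate": "A movie titled '{{doc.title}}'",
            }
        }
    }
    return urllib.request.Request(
        f"{MEILI_URL}/indexes/{index_uid}/settings",
        data=json.dumps(payload).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )


def hybrid_search_payload(query, semantic_ratio=0.5):
    """Search body: 0.0 is pure keyword search, 1.0 is pure semantic search."""
    return {
        "q": query,
        "hybrid": {"embedder": "default", "semanticRatio": semantic_ratio},
    }
```

The payload from hybrid_search_payload is posted to the same /indexes/{index_uid}/search endpoint as an ordinary search.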
Permission management is provided, including temporary permissions.
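One way to read "temporary permissions" is an API key that is restricted to certain actions and indexes and that expires on a given date. The sketch below builds such a key request for POST /keys. The 30-day lifetime, the description text, and the helper names are assumptions for illustration; the now parameter is there only to make the expiry calculation reproducible.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

MEILI_URL = "http://localhost:7700"  # assumed local instance
MASTER_KEY = "meilisearch-api"       # master key from the install step


def search_key_payload(index_uid, valid_for_days=30, now=None):
    """Describe a search-only key for one index that expires after N days."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(days=valid_for_days)
    return {
        "description": f"Search-only key for {index_uid}",
        "actions": ["search"],   # no write or settings permissions
        "indexes": [index_uid],  # limited to a single index
        "expiresAt": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def create_key_request(payload):
    """Build POST /keys, authenticated with the master key."""
    return urllib.request.Request(
        f"{MEILI_URL}/keys",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {MASTER_KEY}",
        },
    )
```

The response to this request contains the generated key, which can then be handed to a front end instead of the master key.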
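Queries
- Basic search: the ordinary full-text search. Note that if you search for Americane, where the trailing e is a typo, American is still matched, because Meilisearch supports typo tolerance.
- Filters: restrict results by field values, much like a where clause over several fields.
- Sorting: results can be sorted in ascending or descending order.
- Pagination: yes, with offset and limit.
- Prefix search: similar to like in MySQL; for example, searching "mat" also matches "matrix".
- Synonyms: when you search for phone, you can configure the index to also search for [iphone, apple phone], which saves you from querying several times with different aliases.
- Searchable attributes: you can specify which fields of an index can be searched.
- Attribute cropping: suppose the index holds novels, with a 100,000-character novel stored in the content field. Returning it in full would make the response very large, so you can specify at search time how many words of that field to return.
- Selecting returned fields: the equivalent of switching from select * to select id,name,sex.
- Highlighting: the matched query terms in the results can be wrapped in a tag, for example 中华<em>人民</em>共和国万岁, where the matched word 人民 is enclosed in the tag so that the front end can style it. The tag is customizable, and several fields can be highlighted at once, for example the title and desc fields of a record.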
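Several of the features above depend on index settings rather than on the query itself: filtering and sorting only work on fields declared as filterable or sortable, and synonyms and searchable fields are also configured per index. The sketch below builds one settings update for the movies sample index. The field names title, overview, genres, and release_date are assumed from the sample dataset, and the helper names are made up for this example.

```python
import json
import urllib.request

MEILI_URL = "http://localhost:7700"  # assumed local instance
API_KEY = "meilisearch-api"          # master key from the install step


def movie_settings():
    """Settings that enable the search features described in the list above."""
    return {
        "searchableAttributes": ["title", "overview"],       # fields that q matches
        "filterableAttributes": ["genres", "release_date"],  # fields usable in filter
        "sortableAttributes": ["release_date"],              # fields usable in sort
        "synonyms": {"phone": ["iphone", "apple phone"]},    # phone also finds these
    }


def settings_request(index_uid, settings):
    """Build PATCH /indexes/{index_uid}/settings without sending it."""
    return urllib.request.Request(
        f"{MEILI_URL}/indexes/{index_uid}/settings",
        data=json.dumps(settings).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
```

Sending it is a single call, urllib.request.urlopen(settings_request("movies", movie_settings())). Like other writes, the change is applied asynchronously.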
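Common query parameters
Parameter | Description |
---|---|
q | Query keywords |
filter | Filter conditions |
sort | Fields to sort by |
limit | Maximum number of results returned |
offset | Number of results to skip |
attributesToRetrieve | Fields to return |
attributesToHighlight | Fields to highlight |
attributesToCrop | Fields to crop |
cropLength | Crop length, in words |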
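To show how the parameters in the table combine, here is a sketch of a search body for the movies sample index. It assumes genres and release_date have been declared filterable and sortable, that release_date is stored as a Unix timestamp, and that page numbers start at 1; the function name and the concrete filter are illustrative only.

```python
import json


def build_search_payload(query, page=1, page_size=20):
    """Combine the common search parameters into one request body."""
    return {
        "q": query,
        # Action movies released after 2000-01-01 (946684800 as a Unix timestamp).
        "filter": "genres = Action AND release_date > 946684800",
        "sort": ["release_date:desc"],    # newest first
        "limit": page_size,
        "offset": (page - 1) * page_size, # skip the earlier pages
        "attributesToRetrieve": ["id", "title", "overview"],
        "attributesToHighlight": ["title", "overview"],
        "highlightPreTag": "<em>",        # customizable highlight tags
        "highlightPostTag": "</em>",
        "attributesToCrop": ["overview"],
        "cropLength": 30,                 # measured in words
    }


# The body is sent as JSON to POST /indexes/movies/search.
body = json.dumps(build_search_payload("batman", page=3))
```

In the response, the highlighted and cropped versions of each field are returned under the _formatted key of each hit, alongside the original values.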
The documentation covers many more query options than the ones listed here; see:
https://meilisearch.org.cn/docs/home