Elasticsearch scripts: script_fields, doc_values, and _source

1. doc and params._source

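Scripts sometimes use doc and sometimes use params._source, which is easy to mix up.
Either one can be used to read a field in a script;
they just work differently:
doc is a wrapper, so you need .value to get at the value; params._source returns the original source data directly, with no .value.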

To modify the original source, however (for example in _update_by_query),

you must go through ctx._source (e.g. ctx._source['field'] = 'value').
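
As a minimal sketch of the write path, the request below reads and writes through ctx._source in an _update_by_query script. The field name ageNextYear and the parameter num are hypothetical, chosen only for illustration; doc[...] is not among the variables an update script receives, which is why everything goes through ctx._source. Running it adds a field to every document, so the search results shown later in this post assume it has not been run.

# Sketch only: ageNextYear is a hypothetical field, not part of the original data
POST test_test001/_update_by_query
{
  "query": {
    "match_all": {}
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source['ageNextYear'] = ctx._source['age'] + params.num",
    "params": {
      "num": 1
    }
  }
}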

So what exactly do doc[xxx] and params._source read from?

1.1 What doc_values and _source actually are

Key points:

1.1.1 doc_values

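In a Painless script, doc['field_name'] reads the field's doc_values.

What are doc_values?

doc_values are a field's "forward index" (document to value), built at index time. They are enabled by default for almost every field type. The main exception is text: a text field is analyzed into tokens, so it has no doc_values, although it usually comes with a keyword multi-field that does.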

1.1.2 _source data

params._source reads the original source data.

Summary:

To read a field's value, use doc['xxx'], which reads the forward index (doc_values);
to read the original source data, use params._source.

2. Example: scripts in script_fields

2.1 Preparing the data and writing the query

2.1.1 Data in the test_test001 index:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_test001",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan4",
          "birth": "0",
          "age": 20
        }
      },
      {
        "_index": "test_test001",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "zhangsan2",
          "birth": 1763308800000,
          "age": 23
        }
      }
    ]
  }
}
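
The post only shows the search response, not how the documents were indexed. One way to load the same two documents is a _bulk request like the sketch below. The use of _bulk, the refresh parameter, and the reliance on dynamic mapping are all assumptions.

# Assumed reconstruction of the sample data; the original indexing request is not shown
POST test_test001/_bulk?refresh=true
{ "index": { "_id": "1" } }
{ "name": "zhangsan4", "birth": "0", "age": 20 }
{ "index": { "_id": "2" } }
{ "name": "zhangsan2", "birth": 1763308800000, "age": 23 }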

2.1.2 Using script_fields in a query

GET test_test001/_search
{
  "_source": ["name", "age", "birth"],  // source fields to return
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "nextYearAge": {
      "script": {
        "lang": "painless",
        "source": "doc['age'].value + params.num",
        "params": {
          "num": 1
        }
      }
    },
    "nextYearAge1": {
      "script": {
        "lang": "painless",
        "source": "doc.age.value + params.num",
        "params": {
          "num": 1
        }
      }
    },
    "nextYearAge2": {
      "script": "params._source.age + 1 "
    },
    "nextYearAge3": {
      "script": "params._source['age'] + 1 "
    },
    "nameLength": {
      "script": {
        "lang": "painless",
        "source": "doc['name.keyword'].value.length()"
        // error: "source": "doc['name'].value.length()"
      }
    },
    "nameLength2": {
      "script": "params._source.name.length()"
    }
  }
}

All of these work:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_test001",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan4",
          "birth": "0",
          "age": 20
        },
        "fields": {
          "nameLength": [
            9
          ],
          "nextYearAge": [
            21
          ],
          "nameLength2": [
            9
          ],
          "nextYearAge1": [
            21
          ],
          "nextYearAge2": [
            21
          ],
          "nextYearAge3": [
            21
          ]
        }
      },
      {
        "_index": "test_test001",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "zhangsan2",
          "birth": 1763308800000,
          "age": 23
        },
        "fields": {
          "nameLength": [
            9
          ],
          "nextYearAge": [
            24
          ],
          "nameLength2": [
            9
          ],
          "nextYearAge1": [
            24
          ],
          "nextYearAge2": [
            24
          ],
          "nextYearAge3": [
            24
          ]
        }
      }
    ]
  }
}

2.2 Field access in script_fields

2.2.1 Usage 1: inside script -> source:

  • case1: doc['fieldname'].value

    "nextYearAge": {
      "script": {
        "lang": "painless",
        "source": "doc['age'].value + params.num",
        "params": {
          "num": 1
        }
      }
    }
  • case2: doc.fieldname.value

    "nextYearAge1": {
      "script": {
        "lang": "painless",
        "source": "doc.age.value + params.num",
        "params": {
          "num": 1
        }
      }
    }
  • case3: the parameter can also be read with bracket notation:

    "source": "doc.age.value + params['num']"

2.2.2 Usage 2: directly as the script string:

  • case4: params._source.fieldname

    "nextYearAge2": {
      "script": "params._source.age + 1 "
    }
  • case5: params._source['fieldname']

    "nextYearAge3": {
      "script": "params._source['age'] + 1 "
    }
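
Because doc['age'] is a wrapper and not the raw value, calling .value on a document that has no value for the field raises an error. A common guard is to check size() first, as in the sketch below. The field name nextYearAgeSafe and the -1 sentinel are hypothetical choices for illustration, not something the original query uses.

# Sketch only: nextYearAgeSafe and the -1 sentinel are hypothetical
GET test_test001/_search
{
  "script_fields": {
    "nextYearAgeSafe": {
      "script": {
        "lang": "painless",
        "source": "doc['age'].size() == 0 ? -1 : doc['age'].value + params.num",
        "params": {
          "num": 1
        }
      }
    }
  }
}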

3. doc_values

3.1 The text type does not support doc_values

Because text fields are analyzed into tokens, they are not suited to sorting or aggregations.
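
Sorting and aggregating therefore usually target the keyword sub-field instead. The two requests below are a sketch that assumes name has a keyword multi-field called name.keyword, as the script_fields example in this post implies. The aggregation name names is hypothetical.

# Sketch only: assumes the name.keyword sub-field exists
GET test_test001/_search
{
  "size": 0,
  "aggs": {
    "names": {
      "terms": {
        "field": "name.keyword"
      }
    }
  }
}

GET test_test001/_search
{
  "sort": [
    { "name.keyword": "asc" }
  ]
}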

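3.2 What are doc_values

Note: text fields cannot be used for sorting, aggregations, or doc[] access in scripts, because text fields have no doc_values by default. A text field can, however, have a keyword multi-field, and that sub-field can have doc_values enabled.

As in the script_fields example above: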

"nameLength": {
  "script": {
    "lang": "painless",
    "source": "doc['name.keyword'].value.length()"
    // error: "source": "doc['name'].value.length()"
  }
}
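
The name.keyword sub-field only exists if the mapping defines it. The post does not show the mapping of test_test001, so the following is an assumption: the first request inspects the existing mapping, and the second spells out a mapping similar to what dynamic mapping typically generates for a string and a whole number. The index name test_test001_explicit is hypothetical.

# Check whether name has a keyword sub-field in the existing index
GET test_test001/_mapping

# Sketch only: explicit mapping on a hypothetical new index
PUT test_test001_explicit
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "age": {
        "type": "long"
      }
    }
  }
}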