  1. Planet4
  2. PLANET-4987

Prevent sync of non-pdf attachments to ElasticSearch


    • Task
    • Resolution: Released
    • Should have
    • 2.34.2
    • None
    • 2
    • Search
    • Sprint #132, Sprint #133, Sprint #134, Sprint #135, Sprint #136, Sprint #137, Sprint #138, Sprint #139
    • atlas

      The search query already filters out all attachments that are not PDFs. However, the ElasticPress plugin still syncs these attachments to the ES cluster, even though they are never used. This increases the size of the ES index, which affects query performance. The size issue is aggravated by the fact that attachments sometimes contain a lot of duplicate garbage data.

      Probably worse, it slows down ES syncing. We want to avoid that because ES is unavailable during a manual sync and search falls back to MySQL search, which is slow and breaks a couple of things.


      • Investigate whether non-PDF attachments can be skipped during sync
      • Check how ElasticSearch plugin is handling other mime types
      • If it's an easy solution that fits in the estimated time, please implement it; otherwise open a follow-up ticket
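
      The skip rule described above can be sketched as a small predicate. This is only an illustration of the decision logic in Python, not the plugin's actual API: the `Post` class, `should_sync` function, and sample data are hypothetical, and a real fix would have to be wired into whatever hook the sync plugin exposes for excluding a post. It assumes the standard WordPress conventions that attachments have the post type `attachment` and that PDFs carry the mime type `application/pdf`.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: PDFs are identified by this mime type on the attachment.
PDF_MIME_TYPE = "application/pdf"


@dataclass
class Post:
    """Hypothetical, minimal stand-in for a WordPress post record."""
    post_id: int
    post_type: str
    mime_type: Optional[str] = None


def should_sync(post: Post) -> bool:
    """Return True if the post should be sent to the search index."""
    # Non-attachment content (posts, pages, ...) is always synced.
    if post.post_type != "attachment":
        return True
    # Attachments are synced only when they are PDFs.
    return post.mime_type == PDF_MIME_TYPE


# Hypothetical sample data, for illustration only.
posts = [
    Post(1, "post"),
    Post(2, "attachment", "image/jpeg"),
    Post(3, "attachment", "application/pdf"),
    Post(4, "page"),
]

to_sync = [p.post_id for p in posts if should_sync(p)]
print(to_sync)  # → [1, 3, 4]
```

      Applying the same rule at sync time that the search query already applies at read time would keep the index contents and the query filter consistent.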

            pvincent Pieter Vincent