Asset Bank can index the text content of certain types of asset file, so that users can search for keywords within these files. By default, indexing is disabled for all file types. It is enabled by running database statements, which are explained in two separate articles: enable text indexing for text-based files (PDF, Word and plain text files) and enable text indexing for other supported files (DOCX, PPT and PPTX).
A setting controls how much text is extracted from asset files and made searchable. By default, the entire file is indexed, but you may wish to limit this to a certain number of characters to reduce the size of the database and index (especially if you have very large files). Change the setting asset-file-to-text-converter-max-characters to specify the number of characters that should be indexed.
To improve performance, the extracted text is cached in the database as well as being stored in the index. You should not normally need to clear this cache, but if you are troubleshooting problems with file indexing, or have been instructed to do so by Asset Bank support, run the following SQL statement to clear the database cache:
DELETE FROM AssetFileTextContent;
Then perform a reindex of assets so that the text is re-extracted and added to the index. This reindex may take longer than usual, as text will be extracted from all asset files for which indexing is enabled. Please note that in previous versions of Asset Bank the extracted text was not cached in the database. Therefore, if the AssetFileTextContent table doesn't exist in your version of Asset Bank, there is no cache to clear and a reindex alone will re-extract the asset text.
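If you are unsure whether your version caches extracted text, or want to see how much is cached before clearing it, a simple row count can help. This is a minimal sketch that assumes only the table name given above. If the table doesn't exist in your version, the statement fails with an error, which tells you that text is not cached in the database.

```sql
-- Count the rows currently held in the extracted-text cache.
-- An error here means the AssetFileTextContent table does not exist.
SELECT COUNT(*) FROM AssetFileTextContent;
```

Running the same statement after the DELETE should return 0, and the count should rise again once the reindex has re-extracted text.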