WEBVTT NOTE duration:"00:15:24" NOTE recognizability:0.864 NOTE language:en-us NOTE Confidence: 0.875335310833333 440d4c8d-8abf-43e9-bd03-979c9e02df67 00:00:00.000 --> 00:00:02.745 Welcome. This training is provided NOTE Confidence: 0.875335310833333 0eafd741-ed09-410e-a509-82fe7fc8ae33 00:00:02.745 --> 00:00:06.379 by the search team at the EPA. NOTE Confidence: 0.875335310833333 c0dde934-0896-4d96-9ba1-b329b0ba1697 00:00:06.380 --> 00:00:08.065 We're the team that provides NOTE Confidence: 0.875335310833333 f90c1dd4-1623-4b18-ade3-cbaf480918a6 00:00:08.065 --> 00:00:09.750 the search capabilities for the NOTE Confidence: 0.875335310833333 6dd900c9-6663-4a8a-95e2-5be3cb52ea96 00:00:09.806 --> 00:00:11.656 public and internal EPA websites. NOTE Confidence: 0.875335310833333 0e04c55e-7dfb-497a-8b4e-28cb59bc9560 00:00:11.660 --> 00:00:13.684 We're here to help you make the content NOTE Confidence: 0.875335310833333 9b4bc8e1-f8fd-4e53-b39e-f66471f77f5a 00:00:13.684 --> 00:00:15.130 you publish searchable and findable NOTE Confidence: 0.875335310833333 91a5b437-3bce-4f7f-a1a4-465103267edf 00:00:15.130 --> 00:00:17.209 on the intranet and public search at NOTE Confidence: 0.875335310833333 e52c2041-7460-4de1-ac09-3f6eaee1b23d 00:00:17.265 --> 00:00:19.145 the EPA with as much ease as possible. NOTE Confidence: 0.873291086315789 591d3a09-a903-4842-9975-5ea9f030e322 00:00:21.350 --> 00:00:22.802 A little background. NOTE Confidence: 0.873291086315789 e4c3abea-d8fc-4b5b-bf78-85bf12263ad0 00:00:22.802 --> 00:00:25.222 The current search engine, Lucidworks NOTE Confidence: 0.873291086315789 68335533-b786-494d-b122-068bf6eda6e7 00:00:25.222 --> 00:00:28.134 Fusion, was implemented in 2018 to NOTE Confidence: 0.873291086315789 fb657e37-0f15-4325-b372-8f8f1bf4c057 00:00:28.134 --> 00:00:30.469 replace the Google Search Appliance. 
NOTE Confidence: 0.873291086315789 d7bfdcb0-03a8-4056-a60a-158d29c22a42 00:00:30.470 --> 00:00:32.390 Lucidworks Fusion is a search NOTE Confidence: 0.873291086315789 bf43cd65-9fff-4a34-be49-28c1c84ced27 00:00:32.390 --> 00:00:34.310 and analytics platform built on NOTE Confidence: 0.873291086315789 99d356d5-9b1d-4b6c-81b0-ef08d24f1260 00:00:34.379 --> 00:00:36.455 top of Apache Solr. Solr powers NOTE Confidence: 0.873291086315789 d426a88f-ee0b-47b5-a56c-5f8deb590092 00:00:36.460 --> 00:00:38.695 the search. Fusion provides easily NOTE Confidence: 0.873291086315789 8ee1f3f5-6d8c-4fa4-8a3d-f2171435736c 00:00:38.695 --> 00:00:40.483 deployable search enhancement tools NOTE Confidence: 0.873291086315789 c0ad915b-e606-49e1-816b-a4f578ef756a 00:00:40.483 --> 00:00:42.639 such as different connectors to NOTE Confidence: 0.873291086315789 5af04aa2-e5de-4445-a5b0-329aa2e397f5 00:00:42.639 --> 00:00:44.689 easily index different data types, NOTE Confidence: 0.873291086315789 43bc619f-3a99-4143-9493-93a3f2ce336a 00:00:44.690 --> 00:00:46.205 customizable boosting schemes, NOTE Confidence: 0.873291086315789 8bd00387-9193-4a31-bfc2-31a153a17f0e 00:00:46.205 --> 00:00:49.235 click signals boosting, and AI-powered NOTE Confidence: 0.873291086315789 44673e2f-faf3-415c-be2d-4268fcff2b8c 00:00:49.235 --> 00:00:51.720 search relevance improvement tools. 
NOTE Confidence: 0.873291086315789 291daf1e-e897-4b7c-be1a-340b063bbfee 00:00:51.720 --> 00:00:53.442 These are all great tools to help NOTE Confidence: 0.873291086315789 106dcf05-13f8-46c2-9fa8-70f4eaf21ec1 00:00:53.442 --> 00:00:55.239 users find what they're looking for, NOTE Confidence: 0.873291086315789 f68e807a-a23a-40bf-9693-248f2b96b290 00:00:55.240 --> 00:00:56.920 but in order for these tools NOTE Confidence: 0.873291086315789 f433058f-4477-40d1-80b1-1a967ebf76e2 00:00:56.920 --> 00:00:58.040 to work most effectively, NOTE Confidence: 0.873291086315789 42020b50-0f14-4719-a596-4f403fbf4581 00:00:58.040 --> 00:00:59.495 they need good quality search NOTE Confidence: 0.873291086315789 11951be6-c696-436f-ab6e-5c91ca038a56 00:00:59.495 --> 00:01:00.950 friendly content to work with. NOTE Confidence: 0.887767045625 0d81d872-8c44-4560-bb53-14e222ebebf6 00:01:03.960 --> 00:01:05.485 Let's take a quick technical NOTE Confidence: 0.887767045625 5c348948-59aa-4c43-ba38-21d43b7ca5ad 00:01:05.485 --> 00:01:07.453 look at the process of indexing NOTE Confidence: 0.887767045625 da2df3b3-9cca-4e26-821a-387cdd10e612 00:01:07.453 --> 00:01:09.348 and serving results to users. NOTE Confidence: 0.887767045625 6aa01020-a05d-4fca-919f-1aed1da0bf62 00:01:09.350 --> 00:01:11.470 First of all, when you run a search, NOTE Confidence: 0.887767045625 de0afaa2-33a4-44a6-879d-f01384d99a6e 00:01:11.470 --> 00:01:13.630 the search engine is not running NOTE Confidence: 0.887767045625 04e9f84e-518e-4008-8eb8-b57c388f83f9 00:01:13.630 --> 00:01:15.610 queries against several EPA servers. NOTE Confidence: 0.887767045625 7e1426d6-05ed-4d99-b624-24764978bc71 00:01:15.610 --> 00:01:17.040 The query is run against NOTE Confidence: 0.887767045625 08ad2c16-f1f7-48ce-85fd-c856b59a39bd 00:01:17.040 --> 00:01:18.910 what is in the search index. 
NOTE Confidence: 0.887767045625 beca0d46-f597-40d4-a437-7f88457f1203 00:01:18.910 --> 00:01:21.815 The index is made up of pre-identified, NOTE Confidence: 0.887767045625 6ab236ac-15a6-4d9f-97bc-bc7f1d6a0a94 00:01:21.815 --> 00:01:23.711 search-appropriate content from many NOTE Confidence: 0.887767045625 9ad2828b-87f7-40be-addf-b5bd6c4e8b53 00:01:23.711 --> 00:01:25.748 but not all servers at the EPA. NOTE Confidence: 0.887767045625 3584dbbc-f37d-4792-96b5-78a23947dedd 00:01:25.750 --> 00:01:27.528 So if you can't find a page NOTE Confidence: 0.887767045625 1b657bb1-6da7-47ac-b2ca-e260e13076d4 00:01:27.528 --> 00:01:28.290 in search results, NOTE Confidence: 0.887767045625 ff6faa25-7c7f-45e7-a4a7-fa35c26cef16 00:01:28.290 --> 00:01:30.882 it may be that the page is not indexed. NOTE Confidence: 0.887767045625 aee112ff-e3c2-4945-84f7-45b345722529 00:01:30.890 --> 00:01:32.280 If you believe something should NOTE Confidence: 0.887767045625 bf1bf5e8-0cae-48e3-8e84-13d3ebf9b35d 00:01:32.280 --> 00:01:34.268 be in the index and it is not, NOTE Confidence: 0.887767045625 ac33d375-055b-4207-9dad-61b699c9b261 00:01:34.270 --> 00:01:36.826 please reach out to us and let us know. NOTE Confidence: 0.887767045625 ea7775d5-ff6a-449a-b362-62bbfcc7160d 00:01:36.830 --> 00:01:38.490 Search contact information can be NOTE Confidence: 0.887767045625 f4c82867-4af7-4b86-af49-c4a4011af84b 00:01:38.490 --> 00:01:41.340 found at the end of this presentation. NOTE Confidence: 0.887767045625 692adc4e-092d-46fa-a4db-df1f0447831b 00:01:41.340 --> 00:01:42.968 Lucidworks Fusion provides different NOTE Confidence: 0.887767045625 94c12519-6a41-4f8a-89fe-e236e9a7635f 00:01:42.968 --> 00:01:45.003 connectors that are developed to NOTE Confidence: 0.887767045625 33d30d5c-5906-4fdb-98c9-7a60fbf4df44 00:01:45.003 --> 00:01:46.898 process different kinds of content. 
NOTE Confidence: 0.887767045625 fd3d670e-5de7-4b9b-b120-54d7b2b3ec35 00:01:46.900 --> 00:01:49.288 Right now at the EPA there NOTE Confidence: 0.887767045625 d74469ef-7708-4926-97f1-19359bc61fbd 00:01:49.288 --> 00:01:51.280 are three connectors we use. NOTE Confidence: 0.887767045625 b3914b74-067d-4f41-aa77-322335e2956e 00:01:51.280 --> 00:01:53.324 We use a web connector and crawl NOTE Confidence: 0.887767045625 749556de-2769-4277-9fb4-3ba3a3bc948a 00:01:53.324 --> 00:01:55.159 content for most of our servers. NOTE Confidence: 0.887767045625 f42670a8-3bc8-40ad-8a64-95ab3917cfa8 00:01:55.160 --> 00:01:57.631 This means we define URLs from which NOTE Confidence: 0.887767045625 4b93b83b-98f0-4095-9c66-c4a08de5c25e 00:01:57.631 --> 00:01:59.524 a crawler starts crawling, following NOTE Confidence: 0.887767045625 8ccff6f5-c4fc-43a1-8cf1-b17483dffe78 00:01:59.524 --> 00:02:02.184 links on a page to discover content NOTE Confidence: 0.887767045625 05753d68-75eb-4e34-897d-2530c472e4bc 00:02:02.184 --> 00:02:04.796 to index. These start URLs, or seeds, NOTE Confidence: 0.887767045625 aaf73b2b-7200-47a5-adf3-34a4ead616bc 00:02:04.796 --> 00:02:06.958 are usually home pages because home NOTE Confidence: 0.887767045625 5473c1fb-3947-41c0-a4aa-44f0ff7eadbf 00:02:06.958 --> 00:02:09.471 pages have menus and links to the NOTE Confidence: 0.887767045625 9d159934-d0bb-416a-9f6c-940bf15b1889 00:02:09.471 --> 00:02:11.876 most important pages in the site. NOTE Confidence: 0.887767045625 697b6c33-26ba-45d1-a02d-495d6cc47c55 00:02:11.880 --> 00:02:15.177 We use a special connector to ingest NOTE Confidence: 0.887767045625 cd4fce8a-e9be-4973-9201-2ccdb6fde0bd 00:02:15.177 --> 00:02:18.270 SharePoint content: the SharePoint connector. 
NOTE Confidence: 0.887767045625 ac845c0e-1c03-4110-81df-d91481c58057 00:02:18.270 --> 00:02:21.910 And finally, we use a feed to ingest NOTE Confidence: 0.887767045625 2dad9bdc-12bc-4d3e-add1-f304b17f9109 00:02:21.910 --> 00:02:24.170 content management system content. NOTE Confidence: 0.887767045625 1feef563-6645-47fa-86bf-c2f15b239e9e 00:02:24.170 --> 00:02:26.690 The connector is directly connected with a NOTE Confidence: 0.887767045625 bf606e95-803c-4672-b04f-ffa4360a3c6a 00:02:26.690 --> 00:02:29.270 data source. This is a Lucidworks concept. NOTE Confidence: 0.887767045625 769b9c91-ba75-4bcf-a7fc-a75f18419c19 00:02:29.270 --> 00:02:31.118 Data sources enable us to configure NOTE Confidence: 0.887767045625 27b32dfd-bcb3-4919-a130-62eadef39d0b 00:02:31.118 --> 00:02:33.250 how we ingest and index content, NOTE Confidence: 0.887767045625 c6bfb2f6-845a-4819-8ba7-f9a636012aee 00:02:33.250 --> 00:02:35.310 which URL patterns to follow, NOTE Confidence: 0.887767045625 b37a141d-f432-48e7-be2e-465353023e6c 00:02:35.310 --> 00:02:37.070 which patterns not to follow, NOTE Confidence: 0.887767045625 38fc6656-6f3e-40db-b77a-a6dc8500caa4 00:02:37.070 --> 00:02:39.289 which file types to include and exclude, NOTE Confidence: 0.887767045625 049f3436-0b81-4052-b609-6b05f1455078 00:02:39.290 --> 00:02:41.230 how often we refresh content, NOTE Confidence: 0.887767045625 5c3abd65-89c4-4bc9-84e9-3f30683006d5 00:02:41.230 --> 00:02:42.262 the crawl rate, NOTE Confidence: 0.887767045625 e6826786-81de-4e6f-b34e-eefca23b74f7 00:02:42.262 --> 00:02:44.670 what URLs and metadata fields should be NOTE Confidence: 0.887767045625 036283e4-c31d-47d0-9c60-f4e54ac47ad0 00:02:44.735 --> 00:02:47.045 included, and what should be excluded. NOTE Confidence: 0.887767045625 b5026489-38a7-49b1-b243-8102c40d45bd 00:02:47.050 --> 00:02:49.612 We index only web pages and PDFs NOTE Confidence: 0.887767045625 4f0993c8-eab2-47a4-bcf5-feaf10c127ad 00:02:49.612 --> 00:02:50.710 on public search. 
NOTE Confidence: 0.887767045625 9d402a7d-e991-4503-8b71-c2cf55e9fde8 00:02:50.710 --> 00:02:54.478 On Intranet search, we index web pages, PDF, NOTE Confidence: 0.887767045625 ad9c47bd-f0ca-4d13-a7fc-9f0e5897d27e 00:02:54.480 --> 00:02:56.405 and doc and docx file types. NOTE Confidence: 0.798244546956522 1b94156a-7948-4d8d-80a3-a59ceb931328 00:02:58.630 --> 00:03:00.478 What makes a site easy to crawl? NOTE Confidence: 0.798244546956522 4c9657ee-fc5b-4a46-87fa-b10c89e79449 00:03:00.478 --> 00:03:01.989 Content that is properly and NOTE Confidence: 0.798244546956522 db5c3080-2e10-41d5-8595-79c960b910e9 00:03:01.989 --> 00:03:03.624 thoroughly interlinked on a site NOTE Confidence: 0.798244546956522 d5493341-3e4e-4c92-9e9e-8423d785313d 00:03:03.624 --> 00:03:05.450 makes a site easy to crawl. NOTE Confidence: 0.798244546956522 8ba709d3-7950-4524-b919-3a04a3923cf0 00:03:05.450 --> 00:03:07.262 Proper linking means that all pages NOTE Confidence: 0.798244546956522 b7d3c898-87da-4901-846e-2624308b5d4d 00:03:07.262 --> 00:03:09.149 can be reached through a site menu, NOTE Confidence: 0.798244546956522 1c3fb586-88c8-4e29-8b44-bf79b556d39f 00:03:09.150 --> 00:03:10.329 through site navigation, NOTE Confidence: 0.798244546956522 8e574948-2880-4545-afb4-33eeb3d6090a 00:03:10.329 --> 00:03:12.687 or through links on the pages. NOTE Confidence: 0.798244546956522 e6b72852-65cf-4e25-98f0-e646fd79b7ad 00:03:12.690 --> 00:03:14.475 If a document isn't sufficiently NOTE Confidence: 0.798244546956522 26310d25-9bc1-472e-8981-ced7644bee94 00:03:14.475 --> 00:03:15.903 interlinked within the site, NOTE Confidence: 0.798244546956522 d8f43464-cd60-4617-829a-54933ff4477d 00:03:15.910 --> 00:03:17.933 then the crawler will not discover it NOTE Confidence: 0.798244546956522 2809c941-6dfc-4bba-b4ba-c3427262c6e9 00:03:17.933 --> 00:03:20.509 and the content will not be included in the index. 
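NOTE The link-following behavior described above can be sketched as a breadth-first walk over a site's links. This is a minimal illustration, not how the Fusion web connector is implemented: the SITE_LINKS graph, its paths, and the crawl function are all hypothetical. It shows that a page with no incoming links is never discovered unless it is itself given as a start URL.
```python
from collections import deque
# Hypothetical site graph: each page maps to the pages it links to.
SITE_LINKS = {
    "/home": ["/about", "/reports"],
    "/about": ["/home"],
    "/reports": ["/reports/2021"],
    "/reports/2021": [],
    "/orphan.pdf": [],  # exists on the server, but no page links to it
}
def crawl(start_urls, links):
    """Return every page reachable by following links from the start URLs."""
    seen = set(start_urls)
    queue = deque(start_urls)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen
# Crawl from the home page only, then list pages the crawler never reached.
discovered = crawl(["/home"], SITE_LINKS)
missed = sorted(set(SITE_LINKS) - discovered)
print(missed)
```
Starting from "/home" alone leaves "/orphan.pdf" undiscovered; adding it to the start URLs is the only way this sketch picks it up.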
NOTE Confidence: 0.798244546956522 3447cfd3-5517-4bcd-85bf-517e0e029b1c 00:03:20.510 --> 00:03:22.790 When building a new content platform, NOTE Confidence: 0.798244546956522 8f941b90-1d1d-4a84-8484-7d4e40aec15c 00:03:22.790 --> 00:03:25.166 it's best to work with the search team to NOTE Confidence: 0.798244546956522 135f8228-7673-4681-b593-2a03d0d9f1d9 00:03:25.166 --> 00:03:27.450 make sure the site is crawler friendly. NOTE Confidence: 0.798244546956522 dcb775d3-7f99-4fba-9c06-bad5691c27cc 00:03:27.450 --> 00:03:28.620 We'll be happy to help. NOTE Confidence: 0.828527867142857 4fb0e841-761f-4817-9159-542a44de9143 00:03:31.000 --> 00:03:34.535 What makes the site difficult to crawl? NOTE Confidence: 0.828527867142857 9fcdeae2-fb43-4144-b297-d203c19a4406 00:03:34.540 --> 00:03:36.944 One thing is incomplete NOTE Confidence: 0.828527867142857 1e367483-7bf0-448e-a31e-585a670ccd22 00:03:36.944 --> 00:03:39.348 interlinking, and data islands. NOTE Confidence: 0.828527867142857 ee16d8f2-229f-4179-8365-d46434d5ce97 00:03:39.350 --> 00:03:41.870 Pages and groups of pages that have no NOTE Confidence: 0.828527867142857 7d94aa19-b3c5-4b30-9ce7-c98ff3d92bab 00:03:41.870 --> 00:03:43.990 incoming links are called data islands. NOTE Confidence: 0.828527867142857 84ccddcf-8e63-4502-8ba8-1780db01bc51 00:03:43.990 --> 00:03:46.250 A common type is a PDF that can only be NOTE Confidence: 0.828527867142857 315a893f-fb03-4a1d-80a2-7fc3ee6cfc2d 00:03:46.317 --> 00:03:48.465 found through a database search form. 
NOTE Confidence: 0.828527867142857 a930b83c-dc5d-447b-9ebf-98677e666ecd 00:03:48.470 --> 00:03:50.972 These islands are not discoverable by NOTE Confidence: 0.828527867142857 00a13dae-eae0-4521-aeb2-4c3632d7e190 00:03:50.972 --> 00:03:53.598 any crawler unless the search developers NOTE Confidence: 0.828527867142857 45dab333-b104-4de1-ace2-e086ea9d6ee0 00:03:53.598 --> 00:03:56.706 know where the islands are and designate NOTE Confidence: 0.828527867142857 fbb207fd-f5d8-4522-bf36-2c59e7660fef 00:03:56.706 --> 00:03:59.656 those URLs as start URLs for the crawler. NOTE Confidence: 0.828527867142857 8af840a8-5051-4b58-8025-fb49a11729d2 00:03:59.660 --> 00:04:01.805 Another thing that makes sites NOTE Confidence: 0.828527867142857 02ab9361-7cee-4bd4-a0b1-7ced0b67a93f 00:04:01.805 --> 00:04:04.380 difficult to crawl is dynamic URLs. NOTE Confidence: 0.828527867142857 d94c12a0-79d7-4e9c-b3d7-c2fdfa1e5d24 00:04:04.380 --> 00:04:07.500 Dynamic URLs present two problems. NOTE Confidence: 0.828527867142857 1eaf7880-478a-4997-84fc-a61b5e6ad236 00:04:07.500 --> 00:04:10.230 Often, a dynamic URL can have multiple NOTE Confidence: 0.828527867142857 89a218af-a42a-4026-b36c-b1484f5807c2 00:04:10.230 --> 00:04:12.259 options in the parameters in the URL, NOTE Confidence: 0.828527867142857 311f3ceb-547c-4842-b65a-ed5a4927bc61 00:04:12.260 --> 00:04:14.360 such as a sort parameter. NOTE Confidence: 0.828527867142857 cbd4087a-1955-4ad3-8269-e0088c804302 00:04:14.360 --> 00:04:16.593 We only want to return one presentation NOTE Confidence: 0.828527867142857 985948e4-2c91-46cc-a471-47ba6486bc19 00:04:16.593 --> 00:04:18.997 of a document in the search results. 
NOTE Confidence: 0.828527867142857 a2d7ef1a-bc45-4610-969e-4d3f4168f6d5 00:04:19.000 --> 00:04:21.232 So we typically handle this by NOTE Confidence: 0.828527867142857 43a9b288-d93b-4ffb-8a11-3215e1eb4aec 00:04:21.232 --> 00:04:23.124 including only one pattern, like NOTE Confidence: 0.828527867142857 948aaafc-b7f4-4b1e-9953-71b7ca409cfb 00:04:23.124 --> 00:04:25.134 URLs that include sort by date. NOTE Confidence: 0.828527867142857 3c8826fa-7638-4d9c-8f1a-6d6ac75432c1 00:04:25.140 --> 00:04:27.716 But the harder problem is when the URLs NOTE Confidence: 0.828527867142857 ef0fc6c9-3486-42e0-b6b3-2d63bf5153de 00:04:27.716 --> 00:04:29.747 contain parameters that are hard to decode, NOTE Confidence: 0.828527867142857 d27bf943-3e7a-4916-8380-dccfb49761de 00:04:29.750 --> 00:04:32.798 like hex strings. NOTE Confidence: 0.828527867142857 39d9615d-9d1b-4197-bd4d-47219876ccbe 00:04:32.800 --> 00:04:34.676 JavaScript single page applications NOTE Confidence: 0.828527867142857 59e17755-ebe9-4859-b57b-3aa5222c8263 00:04:34.676 --> 00:04:36.552 also present crawling challenges NOTE Confidence: 0.828527867142857 ce57e0b1-ddde-493d-8d2a-4584b6192ec9 00:04:36.552 --> 00:04:39.163 for many crawlers that do not NOTE Confidence: 0.828527867142857 325335e4-d42e-4229-86c3-239a4efb4775 00:04:39.163 --> 00:04:40.775 have robust JavaScript handling. NOTE Confidence: 0.828527867142857 f9fd0d6f-5f45-4ace-83df-94ac50d65742 00:04:40.780 --> 00:04:42.256 If any of these things are NOTE Confidence: 0.828527867142857 7ef23b5b-f648-4ba6-bb71-cf0d038acb88 00:04:42.256 --> 00:04:43.240 true for your site, NOTE Confidence: 0.828527867142857 bac6f961-aecd-4c7e-8098-c531ee3a4cc3 00:04:43.240 --> 00:04:45.322 it's even more important to involve NOTE Confidence: 0.828527867142857 e81be5a3-d072-4193-9efe-e07be3717878 00:04:45.322 --> 00:04:47.619 the search team in your project. NOTE Confidence: 0.828527867142857 256100e6-7d9c-40be-b490-d9c0655933b6 00:04:47.620 --> 00:04:50.977 Site maps. 
The easiest way to make a crawler NOTE Confidence: 0.828527867142857 fef834f9-21b8-451c-ab38-5edc072613b0 00:04:50.977 --> 00:04:53.437 friendly site is by using a site map. NOTE Confidence: 0.828527867142857 1537ddd2-a69c-4f14-8ac6-61eea4cff55c 00:04:53.440 --> 00:04:55.180 On the left is an example. NOTE Confidence: 0.828527867142857 350e7c4a-81ac-449e-b78c-6108ec826c50 00:04:55.180 --> 00:04:57.511 A site map is an XML document NOTE Confidence: 0.828527867142857 cf5a4a0c-7829-495d-a6e0-1f15d2fc6696 00:04:57.511 --> 00:04:59.598 that lists every URL in a site. NOTE Confidence: 0.828527867142857 1cca7e8a-4edb-4143-b6e6-4b2d22ac647a 00:04:59.600 --> 00:05:01.490 Sitemaps are created by the team NOTE Confidence: 0.828527867142857 a5dd99b3-cce0-4378-8f36-b822d0f9a619 00:05:01.490 --> 00:05:03.684 that owns the content. And this may NOTE Confidence: 0.828527867142857 9265eb9f-a22d-404d-9b75-d94356a8775f 00:05:03.684 --> 00:05:05.721 already be a feature of your platform. NOTE Confidence: 0.828527867142857 940fcee1-510f-42c5-9010-8b07ca14752d 00:05:05.730 --> 00:05:07.942 The crawler will use the site map NOTE Confidence: 0.828527867142857 c04cdd1d-c47a-4946-a582-feb5b3f29448 00:05:07.942 --> 00:05:09.982 to discover content to index rather NOTE Confidence: 0.828527867142857 a12245d9-fb45-46ec-a004-9a22ef154baa 00:05:09.982 --> 00:05:12.046 than following links on the website. NOTE Confidence: 0.828527867142857 cac7b7b5-0b48-4fc0-965c-a49a9a36d9e5 00:05:12.050 --> 00:05:12.348 Similarly, NOTE Confidence: 0.828527867142857 d9fbe407-b733-4f08-a3e4-043b8d002430 00:05:12.348 --> 00:05:14.732 a jump file is an HTML list of NOTE Confidence: 0.828527867142857 3848f234-152a-481e-b128-80137d640c7f 00:05:14.732 --> 00:05:16.711 all the content that should be NOTE Confidence: 0.828527867142857 7bb2b5e7-f39a-4b53-986c-65d32fb4804b 00:05:16.711 --> 00:05:18.874 indexed in a site. 
On the right, NOTE Confidence: 0.828527867142857 a094d4af-fb67-4b9b-a1ad-f2e0acec93fb 00:05:18.874 --> 00:05:20.890 an A to Z listing of all pages NOTE Confidence: 0.828527867142857 5cabaa5f-00d0-4fde-bad8-893ecf0710e5 00:05:20.962 --> 00:05:22.761 on your site could also be used NOTE Confidence: 0.828527867142857 bacda24f-e177-494c-9f04-141e49edd601 00:05:22.761 --> 00:05:24.728 as a jump file, or jump page. NOTE Confidence: 0.828527867142857 f97d2560-3d9b-4205-8d05-f2aee73b638a 00:05:24.730 --> 00:05:27.124 Site maps are preferable to jump files, NOTE Confidence: 0.828527867142857 c7ba182b-c1fc-4910-be9a-905c866fa2af 00:05:27.130 --> 00:05:29.182 but if you already have jump NOTE Confidence: 0.828527867142857 c38f3f17-c204-4374-ae0e-cccee63e17f5 00:05:29.182 --> 00:05:30.970 files, they work just fine. NOTE Confidence: 0.828527867142857 716d57ae-01e8-4530-a2dc-4d03ca78b8c2 00:05:30.970 --> 00:05:32.284 Now let's take a look at NOTE Confidence: 0.828527867142857 c3f9c0f3-2e7b-4bf7-8343-863ab0c2a8b9 00:05:32.284 --> 00:05:33.160 search and the results. NOTE Confidence: 0.859763091875 f14386a5-14ce-43aa-bc89-b399b691920d 00:05:36.240 --> 00:05:37.380 Full text search. NOTE Confidence: 0.859763091875 02305a76-a59a-4a71-a9df-8b32a0b941c9 00:05:37.380 --> 00:05:39.660 Full text search is the default type NOTE Confidence: 0.859763091875 7e8d054e-5f9a-4354-aa58-13be02c4b6ad 00:05:39.660 --> 00:05:42.360 of search when searching at the EPA. 
NOTE Confidence: 0.859763091875 b50dd243-be37-4623-a616-39359b9e01d0 00:05:42.360 --> 00:05:44.220 This means search takes into account NOTE Confidence: 0.859763091875 63ab889a-4091-4140-8d36-34fb30fc6531 00:05:44.220 --> 00:05:46.219 the entire content of the document, NOTE Confidence: 0.859763091875 5976dbae-f2c6-48ac-a2ac-a834548a6735 00:05:46.220 --> 00:05:48.044 which is most everything between the body NOTE Confidence: 0.859763091875 c89f089a-a274-4d1e-bfd7-165d62823020 00:05:48.044 --> 00:05:50.219 opening tag and the closing body tag, NOTE Confidence: 0.859763091875 d9d4c274-76d9-4f4b-92de-abe1556bd849 00:05:50.220 --> 00:05:51.176 including metadata. NOTE Confidence: 0.859763091875 8cb6b05a-fd7e-4d5c-b8ff-d3cbc5823929 00:05:51.176 --> 00:05:54.085 We do not include headers, sidebars, NOTE Confidence: 0.859763091875 295e9319-8c33-4227-995c-8491aeecb899 00:05:54.085 --> 00:05:57.175 footers, or the HTML markup itself. NOTE Confidence: 0.81670276125 aad166d5-2060-4b09-85cf-0d3ada4f0988 00:05:59.340 --> 00:06:00.732 How are the documents NOTE Confidence: 0.81670276125 0938c400-56a5-4278-b02c-080fc66d6cf4 00:06:00.732 --> 00:06:02.124 ordered in search results? NOTE Confidence: 0.81670276125 fca5b29b-bd46-47fe-874f-d8419a2a366a 00:06:02.130 --> 00:06:03.810 The relevance score of each NOTE Confidence: 0.81670276125 04b778a9-c08e-4943-b2ed-bda9f2df2e49 00:06:03.810 --> 00:06:05.490 document determines the order in NOTE Confidence: 0.81670276125 a0eba235-430d-4789-ad79-2e0e45c047c0 00:06:05.555 --> 00:06:07.350 which documents will appear in NOTE Confidence: 0.81670276125 adc768db-1981-4111-be93-6b5f7cff0822 00:06:07.350 --> 00:06:09.145 search results, from most relevant NOTE Confidence: 0.81670276125 49ec7f39-0cc1-4958-8a2f-ffa879951426 00:06:09.205 --> 00:06:10.834 to least relevant. 
Relevance score NOTE Confidence: 0.81670276125 a25356da-5994-44e0-98de-9323e4ab5a07 00:06:10.834 --> 00:06:12.519 is a numerical value assigned NOTE Confidence: 0.81670276125 9846d11d-652f-4e88-b329-f664a6e699ab 00:06:12.519 --> 00:06:14.210 to a document that determines NOTE Confidence: 0.81670276125 d22202ce-14ca-4319-9c1c-05eaaf00c934 00:06:14.210 --> 00:06:16.380 how relevant it is to the query. NOTE Confidence: 0.81670276125 67726b70-ea04-41b6-8df9-47d274822d60 00:06:16.380 --> 00:06:18.140 The initial relevance score is NOTE Confidence: 0.81670276125 746d0e87-d541-4c81-b6b1-6998c428f077 00:06:18.140 --> 00:06:19.900 determined by the term frequency- NOTE Confidence: 0.81670276125 505efdac-e078-4a3a-8935-80f6441e0461 00:06:19.900 --> 00:06:21.526 inverse document frequency NOTE Confidence: 0.81670276125 1fcf91df-68f2-438b-9f77-0d29d1535949 00:06:21.526 --> 00:06:24.236 algorithm. Term frequency (TF) is NOTE Confidence: 0.81670276125 9a5a041f-fdcf-4069-8388-6d182572afc8 00:06:24.236 --> 00:06:26.492 essentially how many times the NOTE Confidence: 0.81670276125 01323106-30a0-4058-bbd6-a2206c130a70 00:06:26.492 --> 00:06:28.688 query term is in the document. NOTE Confidence: 0.81670276125 fdc61d8c-5b37-40c7-bc58-3d3bd3a51434 00:06:28.690 --> 00:06:30.218 The inverse document frequency, NOTE Confidence: 0.81670276125 7defe572-0fa8-4fc8-bcf0-92c4fc4f488e 00:06:30.218 --> 00:06:32.510 or IDF, takes into account the NOTE Confidence: 0.81670276125 d9cea7cc-0fd1-4f41-86cc-fe2392476cd8 00:06:32.577 --> 00:06:34.697 frequency of terms across the NOTE Confidence: 0.81670276125 967cd1c4-9c35-4b2f-8ab9-c88cee002a35 00:06:34.697 --> 00:06:36.393 whole collection of documents. 
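NOTE The term frequency and inverse document frequency described above can be combined as in the following sketch. It uses a plain textbook formula, tf * log(N / df), which is an assumption made for illustration and not the exact scoring that Solr or Fusion applies. The DOCS collection and the function names are hypothetical.
```python
import math
# Hypothetical three-document collection, already lowercased and tokenized.
DOCS = {
    "air": "air quality training for air monitoring staff".split(),
    "water": "water quality training".split(),
    "hr": "anti-harassment training".split(),
}
def tf(term, tokens):
    """Term frequency: how many times the term occurs in one document."""
    return tokens.count(term)
def idf(term, docs):
    """Inverse document frequency: smaller for terms found in more documents."""
    containing = sum(1 for tokens in docs.values() if term in tokens)
    return math.log(len(docs) / containing) if containing else 0.0
def score(term, doc_id, docs):
    """Base relevance of one document for a single-term query."""
    return tf(term, docs[doc_id]) * idf(term, docs)
# Compare a term found in every document with a term found in only one.
for term in ("training", "air"):
    print(term, {doc_id: round(score(term, doc_id, DOCS), 3) for doc_id in DOCS})
```
In this toy collection "training" occurs in every document, so its IDF is zero and it cannot separate the documents, while "air" occurs in only one document and gives that document a positive score.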
NOTE Confidence: 0.81670276125 1980c22f-40b4-4984-ab8a-8f5950c7e932 00:06:36.400 --> 00:06:38.035 Common terms across the collection NOTE Confidence: 0.81670276125 3de827fc-a73d-493e-8a52-f4553b1d3015 00:06:38.035 --> 00:06:40.075 have a lower score because they NOTE Confidence: 0.81670276125 3a5df6b9-be49-46f4-9588-53f7d1fe5ad5 00:06:40.075 --> 00:06:42.163 are less useful in determining what NOTE Confidence: 0.81670276125 901bda6f-df11-4953-8048-353b50bfa8a5 00:06:42.163 --> 00:06:43.590 individual documents are about. NOTE Confidence: 0.81670276125 cc660dc8-eda6-4c80-b184-a10149559a81 00:06:43.590 --> 00:06:44.571 Infrequent terms NOTE Confidence: 0.81670276125 2687f568-2f89-48e6-8d61-6d746aa005ae 00:06:44.571 --> 00:06:46.533 have a higher score because they NOTE Confidence: 0.81670276125 f06dad71-3b82-4030-9424-2b2043ec37b5 00:06:46.533 --> 00:06:48.468 are more unique to a document. NOTE Confidence: 0.81670276125 bb84a0fd-a44d-4add-b450-9e614edaf6d2 00:06:48.470 --> 00:06:48.847 Again, NOTE Confidence: 0.81670276125 a9c45526-c51f-4445-9bbc-76d8610a6f62 00:06:48.847 --> 00:06:51.863 this is a base score. Boosts and some NOTE Confidence: 0.81670276125 072d0fc6-0167-4daa-a7a9-5990560ef476 00:06:51.863 --> 00:06:54.790 other things affect the score as well. NOTE Confidence: 0.81670276125 c101699e-0b0c-482f-b818-6d0792a066c6 00:06:54.790 --> 00:06:56.530 We'll talk more about boosts, NOTE Confidence: 0.81670276125 85de7343-94d4-4e99-be83-1ed2c0f803b7 00:06:56.530 --> 00:06:58.768 including click signals, in a moment. NOTE Confidence: 0.82993543 0a445744-8079-455b-82b6-444dc5453114 00:07:02.370 --> 00:07:04.477 What can content owners do to ensure NOTE Confidence: 0.82993543 7a9096b4-9dfe-4d20-9328-423620f83e2d 00:07:04.477 --> 00:07:06.628 their content is optimized for search? NOTE Confidence: 0.82993543 8d83817e-b8e4-4787-9b35-404a525953a9 00:07:06.630 --> 00:07:10.788 The answer is: write high quality content. 
NOTE Confidence: 0.82993543 a75418ba-bda3-43dc-a824-86c4bfbf5ad0 00:07:10.790 --> 00:07:12.194 The title is the first thing NOTE Confidence: 0.82993543 363cdf24-2cd7-4ffa-9832-e745dcf5c649 00:07:12.194 --> 00:07:13.730 a user sees on your page, NOTE Confidence: 0.82993543 1afd37af-1453-471f-97db-4b29d9994108 00:07:13.730 --> 00:07:15.949 and this is also in search results. NOTE Confidence: 0.82993543 0abd6d50-b4f6-4827-b2fe-9b1d66401b7c 00:07:15.950 --> 00:07:18.086 It should be accurate and concise. NOTE Confidence: 0.82993543 7810ca82-5fb7-4503-9c6a-216012c097b2 00:07:18.090 --> 00:07:20.352 For example, if your page is NOTE Confidence: 0.82993543 f80bd5f8-cf12-4131-aa58-61a8e894356b 00:07:20.352 --> 00:07:21.860 about anti-harassment training, NOTE Confidence: 0.82993543 10192da5-bd1e-49ee-b2d5-b53e98528cdc 00:07:21.860 --> 00:07:23.350 then don't name it "training", NOTE Confidence: 0.82993543 c607a56b-af8c-434c-94cb-534a0ab2f6cf 00:07:23.350 --> 00:07:26.720 name it "anti-harassment training". NOTE Confidence: 0.82993543 d7968744-0f91-40b9-a08b-6c49c3a92ea2 00:07:26.720 --> 00:07:29.780 Also, use descriptive link text. NOTE Confidence: 0.82993543 a1f22c2c-0cf6-47f6-a4a0-08b3c01d1914 00:07:29.780 --> 00:07:30.904 Don't name your links NOTE Confidence: 0.82993543 3f15dafe-9b77-4dbb-a636-3e59d555243c 00:07:30.904 --> 00:07:32.590 just "link". Link text counts as NOTE Confidence: 0.82993543 33132eda-3a6c-4dda-a78f-ef8e46afd2a7 00:07:32.648 --> 00:07:34.377 part of the content of the page NOTE Confidence: 0.82993543 9a093a38-2b26-448b-b3d9-e04fab9687f5 00:07:34.380 --> 00:07:36.600 linked to. NOTE Confidence: 0.82993543 44810fa1-04f5-42cc-9180-82cb53618580 00:07:36.600 --> 00:07:38.765 Another tip is to write unique NOTE Confidence: 0.82993543 23ae7a40-c530-4094-8ea1-5529e8806feb 00:07:38.765 --> 00:07:40.497 metadata for each page. 
NOTE Confidence: 0.82993543 d985c21e-ee45-434a-a88d-93e098155ab5 00:07:40.500 --> 00:07:43.503 Don't use the same title or description NOTE Confidence: 0.82993543 f5d60e72-eb18-42c4-a2ee-371eb346f1db 00:07:43.503 --> 00:07:45.930 for several pages in a web area. NOTE Confidence: 0.82993543 21c0080d-cc0b-4f6b-be77-9307d67c1c3f 00:07:45.930 --> 00:07:47.701 Keep in mind the title and description NOTE Confidence: 0.82993543 80dadfba-c876-4bfb-b7a6-402d9b3827f4 00:07:47.701 --> 00:07:49.418 appear on the search results page NOTE Confidence: 0.82993543 34531eba-164f-4f14-8350-3a4f9830e8fd 00:07:49.418 --> 00:07:50.928 and help users make decisions NOTE Confidence: 0.82993543 ab090af6-9662-4e01-b116-2f1e6a10b638 00:07:50.928 --> 00:07:52.439 about which pages are relevant NOTE Confidence: 0.82993543 71d2ca62-fb1f-4a9c-a402-8c1a9e3dce5d 00:07:52.439 --> 00:07:53.869 to what they're looking for. NOTE Confidence: 0.838183125384615 8290e7c9-ce21-4f17-a1bf-564c7cafcf52 00:07:55.890 --> 00:07:58.038 These and more best practices are NOTE Confidence: 0.838183125384615 b96b411b-55d7-44d0-96fe-ca650c5032fa 00:07:58.038 --> 00:08:00.869 found in the US EPA writing guide. NOTE Confidence: 0.838183125384615 6923c1c0-39e7-49ad-b932-05e4d682f3ab 00:08:00.870 --> 00:08:02.690 Please note, we will have a list NOTE Confidence: 0.838183125384615 f41ecca9-fbf4-4f1c-b8d1-92ec724c2911 00:08:02.690 --> 00:08:03.867 of referenced EPA documentation NOTE Confidence: 0.838183125384615 bb42a118-cfeb-41d8-a900-c2ebe2e0a76b 00:08:03.867 --> 00:08:05.877 at the end of the presentation. 
NOTE Confidence: 0.875381251071429 0c3c44d1-8a4c-444f-a0aa-38a196ecb1a9 00:08:08.690 --> 00:08:10.776 Another important way to enhance the NOTE Confidence: 0.875381251071429 3ae31c11-0821-4357-98d1-5482b81b40cb 00:08:10.776 --> 00:08:13.116 quality of your content is to know how NOTE Confidence: 0.875381251071429 794f3a7f-c2ce-468b-85e7-bf17423fd0c7 00:08:13.116 --> 00:08:15.159 users search for your content, and then NOTE Confidence: 0.875381251071429 cf8943a7-9af7-4bde-9f83-7ad65d848e45 00:08:15.159 --> 00:08:16.959 use that language in your documents. NOTE Confidence: 0.875381251071429 f4b11f61-6258-4c11-b8b2-ca80a3daf0ef 00:08:16.960 --> 00:08:19.426 There are a few tools that can help you NOTE Confidence: 0.875381251071429 c9e54402-4ed0-4a53-9af0-64d21a4f3ae5 00:08:19.426 --> 00:08:21.628 understand how users search for your content. NOTE Confidence: 0.875381251071429 07b03053-8c28-4afd-94af-f16f3c268de0 00:08:21.630 --> 00:08:23.352 First, you can look at the query NOTE Confidence: 0.875381251071429 542e844e-1bb3-4d5d-a041-3ee9c1b9f4b1 00:08:23.352 --> 00:08:25.207 logs to see what terms are entered NOTE Confidence: 0.875381251071429 c84c25f6-c150-4aea-b07c-b0c5bf6c06e1 00:08:25.207 --> 00:08:27.423 in the search box at the EPA that NOTE Confidence: 0.875381251071429 17954c6b-7756-411c-9099-6a696ec0b372 00:08:27.423 --> 00:08:29.145 may be relevant to your content. NOTE Confidence: 0.875381251071429 776e17b7-56e1-4fe8-81de-c61195dd965e 00:08:29.150 --> 00:08:31.159 Second, you can look at the click- NOTE Confidence: 0.875381251071429 6dff66df-094c-44e3-8042-86e476163599 00:08:31.159 --> 00:08:32.909 through reports in Search Central. 
NOTE Confidence: 0.875381251071429 d8bbf4f7-d966-4efe-af7b-526c794d5be8 00:08:32.910 --> 00:08:33.621 With this tool, NOTE Confidence: 0.875381251071429 9b01832e-1a53-433c-a4b7-2577b75b0985 00:08:33.621 --> 00:08:35.914 you can enter a URL of a specific page NOTE Confidence: 0.875381251071429 02f404dd-92ea-40b0-ba2b-42c32bda1ee4 00:08:35.914 --> 00:08:38.610 or a web area, and see which search terms NOTE Confidence: 0.875381251071429 0bfae6d3-3dd0-4fee-a6fa-8f75cc0596ab 00:08:38.610 --> 00:08:41.214 were used when that URL was clicked. NOTE Confidence: 0.875381251071429 ff660308-48fb-4151-9fda-4e673624b5b9 00:08:41.220 --> 00:08:43.074 Third, Google Analytics has a couple NOTE Confidence: 0.875381251071429 a745bebb-2b8a-4c29-9804-3e90541b288f 00:08:43.074 --> 00:08:44.914 different ways to see the terms NOTE Confidence: 0.875381251071429 44f96e35-3e50-4f39-9bfd-5b31a59472fa 00:08:44.914 --> 00:08:46.546 and concepts queried, that are similar NOTE Confidence: 0.875381251071429 58a17972-767c-4b0d-95bf-3147c114da2a 00:08:46.546 --> 00:08:48.477 to, or relevant to, your content, NOTE Confidence: 0.875381251071429 8bc54825-d554-44f5-bf58-bdc0b46a0763 00:08:48.480 --> 00:08:50.850 but also offers more specific information NOTE Confidence: 0.875381251071429 8f6c2e8b-dfdb-4832-a6fa-527851d7a14e 00:08:50.850 --> 00:08:53.240 about search exits and refinements. NOTE Confidence: 0.875381251071429 83de5bad-c3ce-40ed-97de-0945a50a9608 00:08:53.240 --> 00:08:54.602 Learn more on the web analytics NOTE Confidence: 0.875381251071429 00202d10-b063-4a5c-941a-8151a7c1bf64 00:08:54.602 --> 00:08:55.850 page in the Web guide. NOTE Confidence: 0.855384883666667 73361989-bca2-4e3c-9de2-dc2a3d766588 00:08:59.220 --> 00:09:00.150 Metadata in search. 
NOTE Confidence: 0.855384883666667 2fa6be2e-02db-4094-8b74-1dc5ce296853 00:09:00.150 --> 00:09:02.010 Good metadata is key to having NOTE Confidence: 0.855384883666667 adfcf966-b6e6-490b-a919-8b9fe233e553 00:09:02.010 --> 00:09:04.305 your pages perform well in search NOTE Confidence: 0.855384883666667 e1ad1e6f-4f5f-4c7f-a0c6-4725197db8d3 00:09:04.305 --> 00:09:05.841 results, and accurate, descriptive NOTE Confidence: 0.855384883666667 b5888e94-9dfb-407b-9f3b-26eba61d6819 00:09:05.841 --> 00:09:07.950 titles and descriptions are key to NOTE Confidence: 0.855384883666667 6ae0351b-3eef-4284-8f51-c25bd448f246 00:09:07.950 --> 00:09:09.615 having your content selected in NOTE Confidence: 0.855384883666667 39504acd-59cd-4005-b38e-fc558a0bff3e 00:09:09.620 --> 00:09:12.400 search results. NOTE Confidence: 0.855384883666667 d7f3516f-aea3-4c20-89da-dc374ac25f5f 00:09:12.400 --> 00:09:14.014 The title and description for each NOTE Confidence: 0.855384883666667 1d783c7c-9fdf-4388-85f4-0b4cf9dbec56 00:09:14.014 --> 00:09:15.445 document comes from the content NOTE Confidence: 0.855384883666667 3fd0d765-0b19-4a1a-b6aa-d58328f1ad9d 00:09:15.445 --> 00:09:17.239 entered in the title and description NOTE Confidence: 0.855384883666667 23543788-f78b-4a68-9a50-d6f7e96ec24b 00:09:17.240 --> 00:09:19.620 metadata fields in the document. NOTE Confidence: 0.855384883666667 a773d4e4-200f-410c-b51b-63cbf173d0aa 00:09:19.620 --> 00:09:21.696 We also use metadata field content NOTE Confidence: 0.855384883666667 7240baa1-71df-485d-af2e-6b492446941c 00:09:21.696 --> 00:09:24.148 to sort content into the facets on NOTE Confidence: 0.855384883666667 2914016b-1ee1-4a55-8c8b-a1256ea1111d 00:09:24.148 --> 00:09:26.294 the search results page. And metadata NOTE Confidence: 0.855384883666667 7298b520-8b87-483d-bf8a-16842e083efd 00:09:26.294 --> 00:09:28.736 is used in customized search forms. 
NOTE Confidence: 0.905901672 9d141f10-a62d-4f2c-86f9-ac280353597a 00:09:31.630 --> 00:09:33.930 Now let's talk about boosting. NOTE Confidence: 0.905901672 ff7cab5a-eaf0-4146-a5d6-3494b739ff92 00:09:33.930 --> 00:09:36.192 Earlier, we talked about the base NOTE Confidence: 0.905901672 ec9a277b-ecb4-40b3-b700-02159d2e0f42 00:09:36.192 --> 00:09:38.170 relevance score. To this score, NOTE Confidence: 0.905901672 a8cc6d1b-f0de-499f-878c-6cede68eb7c1 00:09:38.170 --> 00:09:39.780 additional boosts are applied. In NOTE Confidence: 0.905901672 a747a5d4-c107-47c4-99c6-4b87ab81b987 00:09:39.780 --> 00:09:42.173 the last slide, we talked about why NOTE Confidence: 0.905901672 9ef3cdf3-fe74-4ccf-ac24-0cca7ea49594 00:09:42.173 --> 00:09:43.988 good quality metadata is important. NOTE Confidence: 0.905901672 3c5431fc-5e80-4a42-92c1-b9a7518cf2d8 00:09:43.990 --> 00:09:46.146 But perhaps most important is that when NOTE Confidence: 0.905901672 c101b7a5-9237-4cea-9108-27c9ddd1ee4b 00:09:46.146 --> 00:09:48.288 the query term appears in the title, NOTE Confidence: 0.905901672 9de3b229-8c78-4c54-8c13-675b4daa1cc2 00:09:48.290 --> 00:09:49.592 description or URL, NOTE Confidence: 0.905901672 326e8300-203e-42cb-a480-4a160faf467c 00:09:49.592 --> 00:09:52.196 that document will receive an additional NOTE Confidence: 0.905901672 27033fd3-acca-4862-898f-9d5317eda3d2 00:09:52.196 --> 00:09:54.550 boost to the base relevance score. NOTE Confidence: 0.905901672 fd8acecc-dd2f-4c41-864a-8f95d8e53b19 00:09:54.550 --> 00:09:56.398 This means it'll appear closer to NOTE Confidence: 0.905901672 b21229fc-0da1-4fb1-b72e-773dd770ff1a 00:09:56.398 --> 00:09:58.590 the top of results for that query. NOTE Confidence: 0.905901672 7200cbaa-3c7d-401c-800f-3dfb901bcf3b 00:09:58.590 --> 00:10:00.045 Another type of boost applied NOTE Confidence: 0.905901672 190759ea-c741-4edd-ab23-41c627edf609 00:10:00.045 --> 00:10:01.534 is a signals boost.
NOTE Confidence: 0.905901672 cda527d8-cd8b-4c37-b471-5f36bb3bdc75 00:10:01.534 --> 00:10:04.718 What is a signals boost? Every time someone NOTE Confidence: 0.905901672 a07cfb09-daea-4823-a512-773e1ea8786a 00:10:04.718 --> 00:10:06.860 clicks on a document in search results, NOTE Confidence: 0.905901672 bd48a583-2f27-479a-a419-1db70d257019 00:10:06.860 --> 00:10:08.876 a signal is stored for that NOTE Confidence: 0.905901672 75a2dd34-5275-490d-8979-5019d59ca2db 00:10:08.876 --> 00:10:10.220 document, for that query. NOTE Confidence: 0.905901672 8437e662-b621-4551-9343-598a4134628f 00:10:10.220 --> 00:10:12.397 The more signals collected for a page, NOTE Confidence: 0.905901672 673f5347-215a-4260-8753-c60d35cce26c 00:10:12.400 --> 00:10:14.320 the bigger the boost. The NOTE Confidence: 0.905901672 c3d325be-9107-40b4-a091-4133a4ef1430 00:10:14.320 --> 00:10:16.240 clicks accumulate for the URL, NOTE Confidence: 0.905901672 e2fb30ba-b11d-4fb3-ad0f-0bbf32bb246b 00:10:16.240 --> 00:10:18.216 so if you move or redirect a page, NOTE Confidence: 0.905901672 02ee7545-3be8-4e45-95a0-1ce784e371e6 00:10:18.220 --> 00:10:20.784 it will start over with signals. NOTE Confidence: 0.905901672 8be470ee-02ce-4844-b866-2903f64ffab9 00:10:20.784 --> 00:10:23.052 Newly indexed pages are at a slight NOTE Confidence: 0.905901672 09801f3a-e097-4455-b819-168e1503e97c 00:10:23.052 --> 00:10:24.546 disadvantage because those pages NOTE Confidence: 0.905901672 9aee424b-e7b5-43c3-981f-7b82ef624964 00:10:24.546 --> 00:10:26.130 haven't accumulated signals yet.
NOTE Confidence: 0.879119299090909 5a5bc3d5-b89b-4a7b-8f20-091b5a3a12ca 00:10:29.930 --> 00:10:31.292 To help you find what you're NOTE Confidence: 0.879119299090909 87ae2956-1d94-4e0b-a4e6-fb793ec08630 00:10:31.292 --> 00:10:32.450 looking for at the EPA, NOTE Confidence: 0.879119299090909 15ddb089-c0b0-478a-ac6d-db50a17c5c79 00:10:32.450 --> 00:10:34.350 we've gathered some best practices NOTE Confidence: 0.879119299090909 0cb5a760-832d-4461-8930-b3ff335b4d73 00:10:34.350 --> 00:10:35.870 when searching with Fusion. NOTE Confidence: 0.879119299090909 a50a7486-ac61-4102-bfde-73bd42b7f4c1 00:10:35.870 --> 00:10:38.708 One, be as specific as possible. NOTE Confidence: 0.879119299090909 14e2525c-ed67-4bc2-ad3b-767cd9cfa748 00:10:38.710 --> 00:10:40.985 We see a lot of broad, single NOTE Confidence: 0.879119299090909 083dc711-96a2-4c82-8c95-19a6f209f182 00:10:40.985 --> 00:10:42.630 word queries at the EPA. NOTE Confidence: 0.879119299090909 c3c7fa23-9cf7-44f2-aebd-70550b0a2f81 00:10:42.630 --> 00:10:45.227 A broad query will return broad results. NOTE Confidence: 0.879119299090909 65eedfca-6c7f-4118-b9b2-5703e711bbc9 00:10:45.230 --> 00:10:47.558 For example, if you're looking for NOTE Confidence: 0.879119299090909 6005de03-0c9d-41c2-a1ec-3a0899565323 00:10:47.558 --> 00:10:49.110 information about bereavement leave, NOTE Confidence: 0.879119299090909 58331695-ec22-42bb-847d-1db917c2fa6a 00:10:49.110 --> 00:10:50.646 query "bereavement leave" instead NOTE Confidence: 0.879119299090909 940dfc3c-e611-4a39-8224-4c6003cd10a1 00:10:50.646 --> 00:10:52.566 of just "leave". 
The query NOTE Confidence: 0.879119299090909 ecbaf3b5-33eb-4df0-ac1b-24f64cc84d10 00:10:52.570 --> 00:10:54.614 "leave" will yield all kinds of documents NOTE Confidence: 0.879119299090909 bdd957fd-3326-4ae1-8d86-dd7381514729 00:10:54.614 --> 00:10:55.969 about different types of leave, NOTE Confidence: 0.879119299090909 0a80672e-d20c-4875-8139-669db173b618 00:10:55.970 --> 00:10:57.994 but also, leave is a common word and NOTE Confidence: 0.879119299090909 97f3cc11-c982-4102-88bb-52084deff6bf 00:10:57.994 --> 00:11:00.020 will match many, many documents. NOTE Confidence: 0.879119299090909 b28c623c-2369-4290-a2bc-e1f020ae31e4 00:11:00.020 --> 00:11:03.590 Two, use quotes with exact phrases. NOTE Confidence: 0.879119299090909 e1d37ec3-bc65-4c1b-802c-a83d11663537 00:11:03.590 --> 00:11:05.290 Fusion defaults to an OR NOTE Confidence: 0.879119299090909 1e48d504-cbce-4001-991c-f23808b1a968 00:11:05.290 --> 00:11:06.310 between query words, NOTE Confidence: 0.879119299090909 b295a531-375d-4ab8-b363-0918832e695c 00:11:06.310 --> 00:11:08.354 which means if you type the query NOTE Confidence: 0.879119299090909 f09e303f-6106-4d17-86b4-672de293e06c 00:11:08.354 --> 00:11:09.630 "bereavement leave" without quotes, NOTE Confidence: 0.879119299090909 a40fb9a4-13a1-4ea0-bcbb-adbc28572be8 00:11:09.630 --> 00:11:12.038 the search engine will look for documents NOTE Confidence: 0.879119299090909 fceb6e74-1fa8-47e5-8003-92c134dbaba9 00:11:12.038 --> 00:11:14.150 with "bereavement" or with "leave" in them. NOTE Confidence: 0.879119299090909 121a1b0f-15bb-444c-b58f-2281a057214a 00:11:14.150 --> 00:11:15.914 If you put "bereavement leave" in quotes, NOTE Confidence: 0.879119299090909 e7ca336b-4b43-4dc2-8341-32ddc2f8afff 00:11:15.920 --> 00:11:18.097 it will look for the words together. NOTE Confidence: 0.879119299090909 158345b9-7710-4b74-bf51-ca94ccd6819c 00:11:18.100 --> 00:11:20.940 Three, refine and try again.
NOTE Confidence: 0.879119299090909 d3499fe9-591a-43fb-806c-685645ef4b7b 00:11:20.940 --> 00:11:23.680 Make observations about your results. NOTE Confidence: 0.879119299090909 df594ea3-6afd-4c3b-8a15-7b638a5d3c56 00:11:23.680 --> 00:11:25.188 Are they too broad? NOTE Confidence: 0.879119299090909 e20640ca-c61c-412c-9074-a3769760cd63 00:11:25.188 --> 00:11:26.696 Then add more terms. NOTE Confidence: 0.879119299090909 223740fe-fc65-49e6-a9bc-bb43664bebde 00:11:26.700 --> 00:11:28.128 Are they too precise? NOTE Confidence: 0.879119299090909 2901b30c-203c-405c-99e2-0326be97ae8d 00:11:28.128 --> 00:11:29.556 Then try fewer terms, NOTE Confidence: 0.879119299090909 18b6fe0b-ca88-4a5e-b398-a2222b2b52d3 00:11:29.560 --> 00:11:30.900 or perhaps different terms. NOTE Confidence: 0.879119299090909 394a6c95-4661-45ad-8f26-c26c2847a7d7 00:11:30.900 --> 00:11:33.275 Please see the Quick Guide to Searching NOTE Confidence: 0.879119299090909 7e8509db-e1b2-4dc2-9159-342a72f8aa28 00:11:33.275 --> 00:11:35.410 at the EPA for more specifics, and NOTE Confidence: 0.879119299090909 2f2d58f4-be53-4285-88d9-458c8c7715cc 00:11:35.410 --> 00:11:37.317 tips for searching at the EPA. NOTE Confidence: 0.883366319 0b990d1b-8039-4d33-9813-9308482d2c47 00:11:40.670 --> 00:11:42.272 You can also try the advanced NOTE Confidence: 0.883366319 aed5ec39-7e67-47e0-83f9-8062c066301d 00:11:42.272 --> 00:11:43.340 search form found above NOTE Confidence: 0.883366319 27d6e010-e725-489d-b9fc-011a10bfa14f 00:11:43.340 --> 00:11:47.140 the search box on the search results page. NOTE Confidence: 0.883366319 1fa785b3-dd22-42fd-9036-25cc7511d35f 00:11:47.140 --> 00:11:50.857 Expand the form by selecting "Advanced search". NOTE Confidence: 0.883366319 139f4089-a8ff-441d-ad95-c7484c05b5c9 00:11:50.860 --> 00:11:53.644 You can also narrow and filter results by NOTE Confidence: 0.883366319 85937bfb-f005-473d-8b05-8e285e516702 00:11:53.644 --> 00:11:56.737 using the facets on the search results page.
NOTE Confidence: 0.883366319 c4e84ec2-e536-4dd4-9e1b-d63bd4b2fb36 00:11:56.740 --> 00:11:59.358 Please see our Quick Guide to Searching NOTE Confidence: 0.883366319 79b16ef7-c715-4133-8668-6d5d19244011 00:11:59.358 --> 00:12:02.358 at the EPA for more search tips. NOTE Confidence: 0.883366319 62803d26-1707-408b-b2ae-21ef8d87dd19 00:12:02.360 --> 00:12:05.510 Search results still not optimal? NOTE Confidence: 0.883366319 f34df6e5-e0dd-4c56-8f32-77d6b092fa74 00:12:05.510 --> 00:12:08.460 Then we make best bets. NOTE Confidence: 0.883366319 5b78f51c-a08b-4162-b2d1-43e3455a4f93 00:12:08.460 --> 00:12:10.686 Best bets are manually curated top NOTE Confidence: 0.883366319 96bc15c5-fa63-4a3c-b4ea-69e712f448c7 00:12:10.686 --> 00:12:13.020 results created by the search team. NOTE Confidence: 0.883366319 c4e36e34-8ae2-40e5-8215-043869e73d5a 00:12:13.020 --> 00:12:13.950 With best bets, NOTE Confidence: 0.883366319 70112757-94d7-444d-98c9-b90099ab6ff9 00:12:13.950 --> 00:12:17.007 we put specific URLs at the top of the NOTE Confidence: 0.883366319 92110a20-9d5e-4d6f-98b7-99690f9ab61f 00:12:17.007 --> 00:12:19.037 search results, for specific queries. NOTE Confidence: 0.883366319 683782ed-6bba-4163-a932-41603c2a6b23 00:12:19.040 --> 00:12:20.924 Regular query reviews, analytics- NOTE Confidence: 0.883366319 1b4b8d6f-1205-4639-9125-f5e8dadc2b5f 00:12:20.924 --> 00:12:23.750 report reviews, and search results analysis NOTE Confidence: 0.883366319 8eb1d4ce-d3d7-478b-836f-c57c519c40ff 00:12:23.818 --> 00:12:26.205 are conducted to ensure users are finding NOTE Confidence: 0.883366319 3f90194c-f426-43b6-90c3-5768cf190e0d 00:12:26.205 --> 00:12:28.718 what they are looking for at the EPA. 
NOTE Confidence: 0.883366319 d7017259-5513-459a-9336-ce9cb0571b69 00:12:28.720 --> 00:12:29.926 If the results are not as NOTE Confidence: 0.883366319 e7dbab5f-d947-481f-92e2-93ee1b129769 00:12:29.926 --> 00:12:30.920 good as they could be, NOTE Confidence: 0.883366319 326f34a4-6a63-4d6d-9bbb-3fd5e4a190b6 00:12:30.920 --> 00:12:32.540 best bets are created. NOTE Confidence: 0.883366319 b6e48c83-1f78-4ec2-a93a-c6c82a9ecb3d 00:12:32.540 --> 00:12:35.426 Best bets undergo a quarterly review and NOTE Confidence: 0.883366319 4d061296-fada-4499-ad8a-43f53cdcd837 00:12:35.426 --> 00:12:38.072 are removed once the URL is organically NOTE Confidence: 0.883366319 88a0d68d-4aaf-4745-9801-6bf7c72bd63f 00:12:38.072 --> 00:12:40.620 ranked high for the specific queries. NOTE Confidence: 0.883366319 1206a05b-7f21-4980-930c-e1aef8541a16 00:12:40.620 --> 00:12:43.014 To request a best bet, send e-mail NOTE Confidence: 0.883366319 cbcc0aa1-c8d3-4f0a-b80b-61f4c79be501 00:12:43.014 --> 00:12:45.516 to Cathy Edstrom and Alison Shahan. NOTE Confidence: 0.883366319 0de56ee9-07be-4872-b7da-358e3389aeec 00:12:45.520 --> 00:12:48.194 Include the query terms, the page title, NOTE Confidence: 0.883366319 2dd4ca64-5ca4-4b97-8c67-b5a2db064386 00:12:48.200 --> 00:12:51.700 URL, and the description. NOTE Confidence: 0.883366319 3d7f61e7-666f-40d3-ba64-53d61502fc17 00:12:51.700 --> 00:12:53.944 Your content will be reviewed and NOTE Confidence: 0.883366319 0dccb364-9860-435e-b905-ce5de9b5e92a 00:12:53.944 --> 00:12:55.966 we'll make any content suggestions NOTE Confidence: 0.883366319 1ece08bf-f05b-47c9-8414-6b480d773d5d 00:12:55.966 --> 00:12:58.246 before creating the best bet. 
NOTE Confidence: 0.883366319 fd368108-1595-4624-857b-9703716d4d1d 00:12:58.250 --> 00:12:59.834 Find out more about best bets NOTE Confidence: 0.883366319 5e77fcc4-0b02-4480-bc2b-7fd4f970be30 00:12:59.834 --> 00:13:01.193 on the Improving Search Results NOTE Confidence: 0.883366319 c0542264-2f0a-4e8f-9a10-0e3b7ad3b526 00:13:01.193 --> 00:13:02.438 page in the web guide. NOTE Confidence: 0.844717710588235 0fdcf9a9-6ddb-4552-a9f9-8d92816c53ae 00:13:05.070 --> 00:13:07.422 Search forms. Search forms are filters NOTE Confidence: 0.844717710588235 6976bf5f-6b65-4c39-b0bb-e9fb2cb1a18b 00:13:07.422 --> 00:13:09.796 that restrict the search to pages NOTE Confidence: 0.844717710588235 bf28713f-10dd-4aef-8030-2bbf1ac18b65 00:13:09.796 --> 00:13:11.646 that meet specific criteria. NOTE Confidence: 0.844717710588235 f5303edd-fb4a-4ebd-8765-94a19ecce064 00:13:11.650 --> 00:13:13.420 For example, it is possible to NOTE Confidence: 0.844717710588235 7500f2e0-2112-4f3b-b16d-09fb538c17cb 00:13:13.420 --> 00:13:15.255 create a search box that will NOTE Confidence: 0.844717710588235 b41ecde6-63e7-44f7-aa10-0772f9be3920 00:13:15.255 --> 00:13:17.025 only search a single web area. NOTE Confidence: 0.844717710588235 b018c331-2f14-4193-a0e9-0b8fb3945709 00:13:17.030 --> 00:13:18.806 You can choose any metadata field NOTE Confidence: 0.844717710588235 9cd840c7-fd79-4eec-a5ed-eb708f8bc481 00:13:18.806 --> 00:13:21.446 to filter on: web area, URL, title, NOTE Confidence: 0.844717710588235 4d7ab882-d4b0-4134-a478-d5979ed174fa 00:13:21.446 --> 00:13:24.550 or content type, to name just a few. NOTE Confidence: 0.844717710588235 c67abf7c-4187-49f8-b0aa-3ef558692fa6 00:13:24.550 --> 00:13:26.314 See the documentation in the web NOTE Confidence: 0.844717710588235 fb27d24a-bbd1-47a3-94e5-f995220775b0 00:13:26.314 --> 00:13:28.289 guide for more info and examples. NOTE Confidence: 0.956926418 449b53e7-66a6-4161-b545-5913adbde781 00:13:31.180 --> 00:13:32.810 Thank you for joining us.
NOTE Confidence: 0.956926418 9340c690-48f0-4af5-8d2b-9adb740f7c05 00:13:32.810 --> 00:13:34.620 This is a checklist with NOTE Confidence: 0.956926418 f8e03131-56dd-4558-9aa3-2707196f2e93 00:13:34.620 --> 00:13:36.068 takeaways from this session. NOTE Confidence: 0.956926418 300ac1fa-d82f-4713-b229-1f172b50a572 00:13:36.070 --> 00:13:37.710 1, Use Google Analytics, NOTE Confidence: 0.956926418 5781707e-2bc9-40dd-b463-98c0fa55a145 00:13:37.710 --> 00:13:39.350 click-through reports, and NOTE Confidence: 0.956926418 45cac497-9546-4d98-bd7d-76a6d21aa9f4 00:13:39.350 --> 00:13:41.414 query logs to understand how NOTE Confidence: 0.956926418 3feca0ff-3381-45f2-b194-efd6d15d0c71 00:13:41.414 --> 00:13:43.369 users search for your content. NOTE Confidence: 0.956926418 6637d750-176f-4789-a890-7700616edd7d 00:13:43.370 --> 00:13:45.848 2, Follow EPA guidance on writing NOTE Confidence: 0.956926418 ba87632a-9734-434f-ad0e-d72f1b9c2b22 00:13:45.848 --> 00:13:48.564 high-quality content. 3, Populate NOTE Confidence: 0.956926418 8432a3a7-6971-4f37-8d76-cb69c40e8a4f 00:13:48.564 --> 00:13:51.044 metadata fields with page-specific, NOTE Confidence: 0.956926418 91009210-2fd9-4823-b24f-102307856a25 00:13:51.050 --> 00:13:52.238 not generic, language. NOTE Confidence: 0.956926418 b473a694-4394-43e8-8867-4148c5d9af41 00:13:52.238 --> 00:13:54.614 Keep in mind, title and description NOTE Confidence: 0.956926418 e227aa87-9279-474d-953a-edb1dcabbe0c 00:13:54.614 --> 00:13:56.781 appear in search results and boosts NOTE Confidence: 0.956926418 5bc86740-c943-4238-989f-2ad79bbe5a9f 00:13:56.781 --> 00:13:59.017 are applied to these metadata NOTE Confidence: 0.956926418 3bc0f23e-6434-4c56-bbbf-41e20950aebf 00:13:59.017 --> 00:14:01.920 fields. 4,
Use search best practices: NOTE Confidence: 0.956926418 2c8a046c-582f-481c-a643-41d68728f08a 00:14:01.920 --> 00:14:04.410 use quotes on phrases, NOTE Confidence: 0.956926418 b4a1699c-9de6-4acd-8537-b7f51df5f303 00:14:04.410 --> 00:14:07.147 use fielded searches, and use advanced search. NOTE Confidence: 0.929784249411765 81f16b85-b75a-4e75-bbe2-992985d026c1 00:14:09.790 --> 00:14:12.072 Questions? Feel free to reach out via NOTE Confidence: 0.929784249411765 80de58fa-485d-41fb-8605-bee179fbac05 00:14:12.072 --> 00:14:14.806 e-mail to the search team with any NOTE Confidence: 0.929784249411765 f90f99bf-b232-4dc3-96e5-fd3550d4ae32 00:14:14.806 --> 00:14:16.720 search-related questions. Thank you.