feat: Add configurable search filters for deprecated packages and minimum publish time #859

Draft
Copilot wants to merge 4 commits into master from copilot/add-filter-for-deprecated-packages

Conversation

Contributor

Copilot AI commented Oct 24, 2025

Overview

Adds two configurable search filters that align cnpmcore with npm registry behavior: administrators can exclude deprecated packages from search results and require a minimum time since publication before a package appears in them.

Motivation

As documented in npm's search documentation:

Please note that newly published packages may take up to two weeks to appear in the search results. Additionally, deprecated packages are excluded from the search results to enhance the user experience.

This PR implements similar functionality for cnpmcore to:

  1. Improve search result quality by filtering out deprecated packages
  2. Allow administrators to control when newly published packages appear in search
  3. Provide flexible configuration for different deployment scenarios

Changes

1. Deprecated Package Filtering

Default Behavior: Deprecated packages are excluded from search results (enabled by default)

Configuration:

# Enable filtering (default)
export CNPMCORE_CONFIG_SEARCH_FILTER_DEPRECATED=true

# Disable to show all packages including deprecated ones
export CNPMCORE_CONFIG_SEARCH_FILTER_DEPRECATED=false

The filter excludes packages where the deprecated field exists and is not empty in the latest version, matching npm's behavior.
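
As a rough in-memory illustration of that rule, the sketch below checks whether a manifest's latest version carries a non-empty deprecated message. The Manifest shape and the isDeprecatedLatest helper are hypothetical and simplified for this example; they are not taken from the PR, where the actual filtering happens inside Elasticsearch.

```typescript
// Hypothetical, simplified manifest shape (an assumption, not cnpmcore's real type).
interface VersionInfo {
  version: string;
  deprecated?: string;
}

interface Manifest {
  name: string;
  'dist-tags': { latest?: string };
  versions: Record<string, VersionInfo>;
}

// Returns true when the latest version has a non-empty `deprecated` message.
// A missing field or an empty string both count as "not deprecated".
function isDeprecatedLatest(manifest: Manifest): boolean {
  const latest = manifest['dist-tags'].latest;
  if (!latest) return false;
  const info = manifest.versions[latest];
  return typeof info?.deprecated === 'string' && info.deprecated.length > 0;
}
```

Under this reading, a package whose older versions are deprecated but whose latest version is not would still appear in search results.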

2. Minimum Publish Time Filtering

Default Behavior: No time filtering (packages appear immediately)

Configuration:

# Only show packages published more than 2 weeks ago
export CNPMCORE_CONFIG_SEARCH_MIN_PUBLISH_TIME=2w

# Other examples
export CNPMCORE_CONFIG_SEARCH_MIN_PUBLISH_TIME=1d   # 1 day
export CNPMCORE_CONFIG_SEARCH_MIN_PUBLISH_TIME=24h  # 24 hours

Supported time units:

  • h - hours
  • d - days
  • w - weeks
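
To make the format concrete, here is a minimal sketch of how a value such as 2w could be turned into a cutoff date. The function name and its handling of invalid input are assumptions for illustration; the PR's own _parseMinPublishTime() may behave differently.

```typescript
// Milliseconds per supported unit: h (hours), d (days), w (weeks).
const UNIT_MS: Record<string, number> = {
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
  w: 7 * 24 * 60 * 60 * 1000,
};

// Hypothetical helper: converts "2w", "1d", or "24h" into a cutoff Date.
// Returns undefined for an empty or unrecognized value (assumed to mean "no time filter").
function parseMinPublishTime(value: string, now: Date = new Date()): Date | undefined {
  const match = /^(\d+)([hdw])$/.exec(value.trim());
  if (!match) return undefined;
  const amount = Number(match[1]);
  const unitMs = UNIT_MS[match[2]];
  // Packages published at or before this instant are old enough to be shown.
  return new Date(now.getTime() - amount * unitMs);
}
```

With now fixed at 2025-10-24T00:00:00.000Z, parseMinPublishTime('2w', now) yields 2025-10-10T00:00:00.000Z, and '24h' and '1d' produce the same cutoff.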

Implementation Details

Architecture

The implementation follows the existing DDD architecture:

Configuration Layer:

  • Added searchFilterDeprecated and searchMinPublishTime to CnpmcoreConfig interface
  • Default values set in config.default.ts with environment variable support
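
A minimal sketch of what these defaults might look like follows. The field and environment variable names come from this description, but the SearchFilterConfig interface, the loadSearchFilterConfig helper, and the exact parsing rules are assumptions and not the contents of config.default.ts.

```typescript
// Hypothetical subset of the config; field names follow the PR description.
interface SearchFilterConfig {
  searchFilterDeprecated: boolean;
  searchMinPublishTime: string; // '' means no time filter
}

// Assumed parsing: only the literal string 'false' turns the deprecated filter off,
// and an unset minimum publish time falls back to an empty string.
function loadSearchFilterConfig(env: Record<string, string | undefined>): SearchFilterConfig {
  return {
    searchFilterDeprecated: env['CNPMCORE_CONFIG_SEARCH_FILTER_DEPRECATED'] !== 'false',
    searchMinPublishTime: env['CNPMCORE_CONFIG_SEARCH_MIN_PUBLISH_TIME'] ?? '',
  };
}
```

In a Node.js process the helper could be called with process.env; taking the environment as a parameter keeps the sketch easy to test.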

Data Layer:

  • Added deprecated field to SearchMappingType and PackageJSONType
  • Field is synced from package manifests to Elasticsearch during package sync

Service Layer:

  • PackageSearchService._buildFilterQueries(): Constructs Elasticsearch filter array based on configuration
  • PackageSearchService._parseMinPublishTime(): Parses time format strings into Date objects
  • Filters are applied as Elasticsearch filter clauses in the bool query
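
The filter construction could be sketched roughly as below. This is a hypothetical stand-in for _buildFilterQueries(), not the PR's code: the EsClause and FilterOptions types are invented for the example, and only the field names package.deprecated and package.date are taken from this description.

```typescript
// Loosely typed Elasticsearch clause; a real project would use the client's own types.
type EsClause = Record<string, unknown>;

// Hypothetical options object for this sketch.
interface FilterOptions {
  filterDeprecated: boolean;
  cutoff?: Date; // packages published after this instant are hidden
}

// Builds the array placed in the bool query's `filter` clause.
function buildFilterQueries(options: FilterOptions): EsClause[] {
  const filters: EsClause[] = [];
  if (options.filterDeprecated) {
    // Keep documents whose deprecated field is missing or is an empty string.
    filters.push({
      bool: {
        should: [
          { bool: { must_not: { exists: { field: 'package.deprecated' } } } },
          { term: { 'package.deprecated': '' } },
        ],
      },
    });
  }
  if (options.cutoff) {
    // Keep documents published at or before the cutoff.
    filters.push({ range: { 'package.date': { lte: options.cutoff.toISOString() } } });
  }
  return filters;
}
```

Returning an empty array when both options are off leaves the query equivalent to an unfiltered search.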

Elasticsearch Query Structure

Filters go in the bool query's filter clause. Clauses in filter context do not contribute to relevance scores and can be cached by Elasticsearch:

{
  query: {
    function_score: {
      query: {
        bool: {
          should: [...matchQueries],
          filter: [
            // Deprecated filter (if enabled)
            {
              bool: {
                should: [
                  { bool: { must_not: { exists: { field: 'package.deprecated' } } } },
                  { term: { 'package.deprecated': '' } }
                ]
              }
            },
            // Time filter (if configured)
            {
              range: {
                'package.date': {
                  lte: '<calculated_cutoff_date>'
                }
              }
            }
          ]
        }
      }
    }
  }
}

Test Coverage

Added 5 new test cases:

  1. ✅ Verifies deprecated filtering is enabled by default
  2. ✅ Verifies deprecated filtering can be disabled via config
  3. ✅ Tests minimum publish time filtering with "2w" format
  4. ✅ Tests all supported time formats (24h, 7d, 2w)
  5. ✅ Tests end-to-end: publish deprecated package → sync to Elasticsearch → verify field stored

Documentation

  • Updated .env.example with clear documentation and examples
  • Added inline code comments explaining filter logic
  • Configuration options documented in TypeScript interfaces

Breaking Changes

None. All new features are:

  • Opt-in for minimum publish time filtering (default: off)
  • Opt-out for deprecated filtering (default: on, can be disabled)
  • Backwards compatible in API and configuration; with the defaults, the only change to search behavior is that deprecated packages no longer appear in results

Migration Guide

No migration required. To use the new features:

  1. Enable deprecated filtering (default): No action needed
  2. Disable deprecated filtering: Set CNPMCORE_CONFIG_SEARCH_FILTER_DEPRECATED=false
  3. Add minimum publish time: Set CNPMCORE_CONFIG_SEARCH_MIN_PUBLISH_TIME=2w (or desired duration)

Fixes #[issue-number]

Original prompt

This section details the original issue you should resolve

<issue_title>Search needs to filter out deprecated version information</issue_title>
<issue_description>https://docs.npmjs.com/searching-for-and-choosing-packages-to-download

Please note that newly published packages may take up to two weeks to appear in the search results. Additionally, deprecated packages are excluded from the search results to enhance the user experience.

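Requirements to implement:

  • Support a configuration option so that only packages published more than 2 weeks ago appear in search results. The default value is empty, and custom durations are supported in hours, days, or weeks.
  • Support a configuration option to enable or disable filtering deprecated packages from search. By default, deprecated packages are filtered out.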
    </issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #858


@graphite-app

graphite-app bot commented Oct 24, 2025

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • MQ - adds this PR to the back of the merge queue
  • Hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

@coderabbitai
Contributor

coderabbitai bot commented Oct 24, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI and others added 3 commits October 24, 2025 06:32
…time

Co-authored-by: fengmk2 <156269+fengmk2@users.noreply.github.com>
Co-authored-by: fengmk2 <156269+fengmk2@users.noreply.github.com>
Co-authored-by: fengmk2 <156269+fengmk2@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add filtering options for deprecated package searches" to "feat: Add configurable search filters for deprecated packages and minimum publish time" Oct 24, 2025
Copilot AI requested a review from fengmk2 October 24, 2025 06:42

Development

Successfully merging this pull request may close these issues.

Search needs to filter out deprecated version information

2 participants