ENH: Skip costly `is_unique` call while creating `Categorical` arrays

### Feature Type

- [x] Adding new functionality to pandas

- [ ] Changing existing functionality in pandas

- [ ] Removing existing functionality in pandas


### Problem Description

I often create `Categorical` data structures. In some cases the number of unique categories is quite large, and the overall length of the `Categorical` can be very long indeed (hundreds of millions of records). I always create these arrays via `Categorical.from_codes` for performance (my codes are stored in a `numpy` array). Even so, I would like to bypass the expensive `is_unique` call that is made while the categories are created.

My simple (and somewhat contrived) example:

```
import numpy as np
import pandas as pd

arr = np.array(list(range(10_000_000)) * 10, dtype=np.int32, order="C")
cats = [f"a{i}" for i in range(10_000_000)]
pd.Categorical.from_codes(codes=arr, categories=cats, validate=False)
```

shows with `cProfile`:

```
   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.877    1.877    1.877    1.877 base.py:2313(is_unique)
       93    1.539    0.017    1.539    0.017 {built-in method numpy.array}
        1    0.709    0.709    4.574    4.574 extract_test.py:1(<module>)
        1    0.120    0.120    0.120    0.120 missing.py:305(_isna_string_dtype)
        4    0.092    0.023    0.098    0.024 cast.py:1579(construct_1d_object_array_from_listlike)
        4    0.032    0.008    0.131    0.033 construction.py:517(sanitize_array)
```


Checking that the categories are unique takes a large chunk of the time (about 1.9 s of the 4.6 s total in the profile above). I've tried to bypass the public API to avoid this `is_unique` call, but I keep running into trouble. In general, I would like to stick to public features only. I know with certainty that my categories are unique.

### Feature Description

There could be a couple solutions here:

1) Perhaps someone knows how to create a `Categorical` array very quickly, assuming that I have pristine data (no NaNs or bad codes, plus guaranteed-unique categories)?  I'd welcome a solution with current methods!

2) If no solution is currently available, perhaps a new `is_unique` argument could be introduced to the `Categorical.from_codes` `classmethod` (with a safe default of `False`)?  The user could turn this on at their own peril.  This doesn't seem to be without precedent; the existing `validate` parameter is documented as:

```
validate : bool, default True

If True, validate that the codes are valid for the dtype.

If False, don't validate that the codes are valid. Be careful about skipping validation, as invalid codes can lead to severe problems, such as segfaults.

```

I'm willing to risk segfaults for speed.

Many hats off to the pandas team/community.  I appreciate your hard work!


### Alternative Solutions

I'm not aware of any other package that would satisfy the goal here.

### Additional Context

_No response_

ENH: Skip costly `is_unique` call while creating `Categorical` arrays #60981