OXIESEC PANEL
- Current Dir: /opt/gsutil/third_party/chardet/chardet/__pycache__
Server IP: 2a02:4780:11:1594:0:ef5:22d7:a
Name                                   Size       Modified                Perms
..                                     -          02/11/2025 08:19:49 AM  rwxr-xr-x
__init__.cpython-39.pyc                3.04 KB    02/11/2025 08:19:49 AM  rw-r--r--
big5freq.cpython-39.pyc                26.49 KB   02/11/2025 08:19:49 AM  rw-r--r--
big5prober.cpython-39.pyc              1.08 KB    02/11/2025 08:19:49 AM  rw-r--r--
chardistribution.cpython-39.pyc        6.98 KB    02/11/2025 08:19:49 AM  rw-r--r--
charsetgroupprober.cpython-39.pyc      2.36 KB    02/11/2025 08:19:49 AM  rw-r--r--
charsetprober.cpython-39.pyc           3.71 KB    02/11/2025 08:19:49 AM  rw-r--r--
codingstatemachine.cpython-39.pyc      2.96 KB    02/11/2025 08:19:49 AM  rw-r--r--
codingstatemachinedict.cpython-39.pyc  642 bytes  02/11/2025 08:19:49 AM  rw-r--r--
cp949prober.cpython-39.pyc             1.09 KB    02/11/2025 08:19:49 AM  rw-r--r--
enums.cpython-39.pyc                   2.61 KB    02/11/2025 08:19:49 AM  rw-r--r--
escprober.cpython-39.pyc               2.71 KB    02/11/2025 08:19:49 AM  rw-r--r--
escsm.cpython-39.pyc                   7 KB       02/11/2025 08:19:49 AM  rw-r--r--
eucjpprober.cpython-39.pyc             2.51 KB    02/11/2025 08:19:49 AM  rw-r--r--
euckrfreq.cpython-39.pyc               11.73 KB   02/11/2025 08:19:49 AM  rw-r--r--
euckrprober.cpython-39.pyc             1.09 KB    02/11/2025 08:19:49 AM  rw-r--r--
euctwfreq.cpython-39.pyc               26.5 KB    02/11/2025 08:19:49 AM  rw-r--r--
euctwprober.cpython-39.pyc             1.09 KB    02/11/2025 08:19:49 AM  rw-r--r--
gb2312freq.cpython-39.pyc              18.61 KB   02/11/2025 08:19:49 AM  rw-r--r--
gb2312prober.cpython-39.pyc            1.1 KB     02/11/2025 08:19:49 AM  rw-r--r--
hebrewprober.cpython-39.pyc            3.29 KB    02/11/2025 08:19:49 AM  rw-r--r--
jisfreq.cpython-39.pyc                 21.57 KB   02/11/2025 08:19:49 AM  rw-r--r--
johabfreq.cpython-39.pyc               36.44 KB   02/11/2025 08:19:49 AM  rw-r--r--
johabprober.cpython-39.pyc             1.09 KB    02/11/2025 08:19:49 AM  rw-r--r--
jpcntx.cpython-39.pyc                  37.02 KB   02/11/2025 08:19:49 AM  rw-r--r--
langbulgarianmodel.cpython-39.pyc      21.24 KB   02/11/2025 08:19:49 AM  rw-r--r--
langgreekmodel.cpython-39.pyc          19.95 KB   02/11/2025 08:19:49 AM  rw-r--r--
langhebrewmodel.cpython-39.pyc         20.01 KB   02/11/2025 08:19:49 AM  rw-r--r--
langrussianmodel.cpython-39.pyc        25.69 KB   02/11/2025 08:19:49 AM  rw-r--r--
langthaimodel.cpython-39.pyc           20.19 KB   02/11/2025 08:19:49 AM  rw-r--r--
langturkishmodel.cpython-39.pyc        20.03 KB   02/11/2025 08:19:49 AM  rw-r--r--
latin1prober.cpython-39.pyc            2.97 KB    02/11/2025 08:19:49 AM  rw-r--r--
macromanprober.cpython-39.pyc          3.11 KB    02/11/2025 08:19:49 AM  rw-r--r--
mbcharsetprober.cpython-39.pyc         2.22 KB    02/11/2025 08:19:49 AM  rw-r--r--
mbcsgroupprober.cpython-39.pyc         1.18 KB    02/11/2025 08:19:49 AM  rw-r--r--
mbcssm.cpython-39.pyc                  17.14 KB   02/11/2025 08:19:49 AM  rw-r--r--
resultdict.cpython-39.pyc              522 bytes  02/11/2025 08:19:49 AM  rw-r--r--
sbcharsetprober.cpython-39.pyc         3.57 KB    02/11/2025 08:19:49 AM  rw-r--r--
sbcsgroupprober.cpython-39.pyc         1.65 KB    02/11/2025 08:19:49 AM  rw-r--r--
sjisprober.cpython-39.pyc              2.54 KB    02/11/2025 08:19:49 AM  rw-r--r--
universaldetector.cpython-39.pyc       6.93 KB    02/11/2025 08:19:49 AM  rw-r--r--
utf1632prober.cpython-39.pyc           6.06 KB    02/11/2025 08:19:49 AM  rw-r--r--
utf8prober.cpython-39.pyc              2.02 KB    02/11/2025 08:19:49 AM  rw-r--r--
version.cpython-39.pyc                 391 bytes  02/11/2025 08:19:49 AM  rw-r--r--
Editing: charsetprober.cpython-39.pyc
[Compiled bytecode shown as text. Only the embedded strings are readable, and the view is cut off partway through the last method.]

Source file: /opt/gsutil/third_party/chardet/chardet/charsetprober.py
Imports: logging, re, typing (Optional, Union), .enums (LanguageFilter, ProbingState)
Module constant: INTERNATIONAL_WORDS_PATTERN, compiled from [a-zA-Z]*[\x80-\xFF]+[a-zA-Z]*[^a-zA-Z\x80-\xFF]?
Class: CharSetProber

Readable method and property names:
  __init__(self, lang_filter)
  reset(self)
  charset_name
  language
  feed(self, byte_str)
  state
  get_confidence(self)
  filter_high_byte_only(buf)
  filter_international_words(buf)
  a final filter taking buf (its name is not visible in this view)

Docstring of filter_international_words:

  We define three types of bytes:
  alphabet: english alphabets [a-zA-Z]
  international: international characters [\x80-\xFF]
  marker: everything else [^a-zA-Z\x80-\xFF]
  The input buffer can be thought to contain a series of words delimited
  by markers. This function works to filter all words that contain at
  least one international character. All contiguous sequences of markers
  are replaced by a single space ascii character.
  This filter applies to all scripts which do not use English characters.

Docstring of the final filter:

  Returns a copy of ``buf`` that retains only the sequences of English
  alphabet and high byte characters that are not between <> characters.
  This filter can be applied to all scripts which contain both English
  characters and extended ASCII characters, but is currently only used by
  ``Latin1Prober``.
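
The two filters described by the docstrings in this file can be sketched in plain Python. This is a minimal illustration written from the docstrings and the readable fragments of the bytecode, not a verbatim copy of the library source. The \x80-\xFF byte ranges in the pattern are an assumption, since those bytes appear only as garbled characters in the view. The function name strip_angle_bracket_spans is hypothetical, because the real method name is cut off.

```python
import re

# Assumption: the "international" byte range is \x80-\xFF; the bytecode view
# shows these bytes only as garbled characters.
INTERNATIONAL_WORDS_PATTERN = re.compile(
    rb"[a-zA-Z]*[\x80-\xFF]+[a-zA-Z]*[^a-zA-Z\x80-\xFF]?"
)


def filter_international_words(buf: bytes) -> bytearray:
    """Keep only words containing at least one high byte.

    Each match is a word plus at most one trailing marker byte; an ASCII
    marker is replaced by a single space.
    """
    filtered = bytearray()
    for word in INTERNATIONAL_WORDS_PATTERN.findall(buf):
        filtered.extend(word[:-1])
        last_char = word[-1:]
        # bytes.isalpha() is True only for ASCII letters, so a high byte
        # falls through to the comparison and is kept unchanged.
        if not last_char.isalpha() and last_char < b"\x80":
            last_char = b" "
        filtered.extend(last_char)
    return filtered


def strip_angle_bracket_spans(buf: bytes) -> bytearray:
    """Hypothetical name: drop everything between '<' and '>'.

    Each run of text that precedes a '<' is copied and followed by one space.
    """
    filtered = bytearray()
    in_tag = False
    prev = 0  # index just past the most recent '>'
    for curr, byte in enumerate(buf):
        if byte == ord(">"):
            prev = curr + 1
            in_tag = False
        elif byte == ord("<"):
            if curr > prev and not in_tag:
                filtered.extend(buf[prev:curr])
                filtered.extend(b" ")
            in_tag = True
    if not in_tag:
        # Copy whatever follows the last closed tag.
        filtered.extend(buf[prev:])
    return filtered


if __name__ == "__main__":
    print(filter_international_words(b"hello caf\xe9 world"))
    print(strip_angle_bracket_spans(b"<p>caf\xe9</p> ok"))
```

With the input b"hello caf\xe9 world", the first filter keeps only b"caf\xe9 ": the pure-ASCII words are dropped and the trailing space marker is preserved as a single space.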