dlk.core.layers.embeddings package
Submodules
dlk.core.layers.embeddings.combine_word_char_cnn module
- class dlk.core.layers.embeddings.combine_word_char_cnn.CombineWordCharCNNEmbedding(config: dlk.core.layers.embeddings.combine_word_char_cnn.CombineWordCharCNNEmbeddingConfig)[source]
Bases:
dlk.core.base_module.SimpleModule
Generate 'embedding' from 'input_ids' and 'char_ids'
- forward(inputs: Dict[str, torch.Tensor]) Dict[str, torch.Tensor] [source]
Get the combined char and word embedding
- Parameters
inputs – one mini-batch inputs
- Returns
one mini-batch outputs
- init_weight(method: Callable)[source]
Initialize the weights of submodules with 'method'
- Parameters
method – init method
- Returns
None
- training: bool
- class dlk.core.layers.embeddings.combine_word_char_cnn.CombineWordCharCNNEmbeddingConfig(config: Dict)[source]
Bases:
dlk.core.base_module.BaseModuleConfig
Config for CombineWordCharCNNEmbedding
- Config Example:
>>> {
>>>     "_name": "combine_word_char_cnn",
>>>     "embedding@char": {
>>>         "_base": "static_char_cnn",
>>>     },
>>>     "embedding@word": {
>>>         "_base": "static",
>>>     },
>>>     "config": {
>>>         "word": {
>>>             "embedding_file": "*@*", // the embedding file, must be saved as a numpy array by pickle
>>>             "embedding_dim": "*@*",
>>>             "embedding_trace": ".", // "." means the file itself is the embedding
>>>             "freeze": false, // whether to freeze the embedding
>>>             "padding_idx": 0, // padding index
>>>             "output_map": {"embedding": "word_embedding"},
>>>             "input_map": {}, // required_key: provide_key
>>>         },
>>>         "char": {
>>>             "embedding_file": "*@*", // the embedding file, must be saved as a numpy array by pickle
>>>             "embedding_dim": 35, // char embedding dim
>>>             "embedding_trace": ".", // "." means the file itself is the embedding
>>>             "freeze": false, // whether to freeze the embedding
>>>             "kernel_sizes": [3], // cnn kernel sizes
>>>             "padding_idx": 0,
>>>             "output_map": {"char_embedding": "char_embedding"},
>>>             "input_map": {"char_ids": "char_ids"},
>>>         },
>>>         "dropout": 0, // dropout rate
>>>         "embedding_dim": "*@*", // must equal char.embedding_dim + word.embedding_dim
>>>         "output_map": {"embedding": "embedding"}, // rename the output key here if needed
>>>         "input_map": {"char_embedding": "char_embedding", "word_embedding": "word_embedding"}, // if the char or word embedding output is renamed, change this accordingly
>>>     },
>>>     "_link": {
>>>         "config.word.embedding_file": ["embedding@word.config.embedding_file"],
>>>         "config.word.embedding_dim": ["embedding@word.config.embedding_dim"],
>>>         "config.word.embedding_trace": ["embedding@word.config.embedding_trace"],
>>>         "config.word.freeze": ["embedding@word.config.freeze"],
>>>         "config.word.padding_idx": ["embedding@word.config.padding_idx"],
>>>         "config.word.output_map": ["embedding@word.config.output_map"],
>>>         "config.word.input_map": ["embedding@word.config.input_map"],
>>>         "config.char.embedding_file": ["embedding@char.config.embedding_file"],
>>>         "config.char.embedding_dim": ["embedding@char.config.embedding_dim"],
>>>         "config.char.embedding_trace": ["embedding@char.config.embedding_trace"],
>>>         "config.char.freeze": ["embedding@char.config.freeze"],
>>>         "config.char.kernel_sizes": ["embedding@char.config.kernel_sizes"],
>>>         "config.char.padding_idx": ["embedding@char.config.padding_idx"],
>>>         "config.char.output_map": ["embedding@char.config.output_map"],
>>>         "config.char.input_map": ["embedding@char.config.input_map"],
>>>     },
>>> }
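The key constraint above is dimensional: the module concatenates the word embedding and the CNN-pooled char embedding along the last axis, so the top-level "embedding_dim" must equal the sum of the two. A minimal sketch of that operation (illustrative shapes, not the module's internals):
>>> import torch
>>> batch, seq_len, word_dim, char_dim = 2, 8, 100, 35  # hypothetical dims
>>> word_embedding = torch.randn(batch, seq_len, word_dim)  # from embedding@word
>>> char_embedding = torch.randn(batch, seq_len, char_dim)  # from embedding@char
>>> embedding = torch.cat([word_embedding, char_embedding], dim=-1)
>>> embedding.shape  # last dim is word_dim + char_dim, i.e. config "embedding_dim"
torch.Size([2, 8, 135])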
dlk.core.layers.embeddings.identity module
- class dlk.core.layers.embeddings.identity.IdentityEmbedding(config: dlk.core.layers.embeddings.identity.IdentityEmbeddingConfig)[source]
Bases:
dlk.core.base_module.SimpleModule
Do nothing; pass the inputs through unchanged
- forward(inputs: Dict[str, torch.Tensor]) Dict[str, torch.Tensor] [source]
return inputs
- Parameters
inputs – anything
- Returns
inputs
- training: bool
- class dlk.core.layers.embeddings.identity.IdentityEmbeddingConfig(config)[source]
Bases:
dlk.core.base_module.BaseModuleConfig
Config for IdentityEmbedding
- Config Example:
>>> {
>>>     "config": {
>>>     },
>>>     "_name": "identity",
>>> }
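Since the forward pass is a pure pass-through, a minimal sketch of the behavior (a hypothetical standalone function, not the class itself):
>>> from typing import Dict
>>> import torch
>>> def forward(inputs: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
...     return inputs  # the mini-batch is returned unchanged
>>> batch = {"embedding": torch.randn(2, 8, 16)}
>>> forward(batch) is batch
True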
dlk.core.layers.embeddings.pretrained_transformers module
- class dlk.core.layers.embeddings.pretrained_transformers.PretrainedTransformers(config: dlk.core.layers.embeddings.pretrained_transformers.PretrainedTransformersConfig)[source]
Bases:
dlk.core.base_module.SimpleModule
Wrap the HuggingFace transformers
- forward(inputs: Dict[str, torch.Tensor]) Dict[str, torch.Tensor] [source]
Get the transformer output as the embedding
- Parameters
inputs – one mini-batch inputs
- Returns
one mini-batch outputs
- init_weight(method)[source]
Initialize the weights of submodules with 'method'
- Parameters
method – init method
- Returns
None
- training: bool
- class dlk.core.layers.embeddings.pretrained_transformers.PretrainedTransformersConfig(config: Dict)[source]
Bases:
dlk.core.base_module.BaseModuleConfig
Config for PretrainedTransformers
- Config Example1:
>>> {
>>>     "module": {
>>>         "_base": "roberta",
>>>     },
>>>     "config": {
>>>         "pretrained_model_path": "*@*",
>>>         "input_map": {
>>>             "input_ids": "input_ids",
>>>             "attention_mask": "attention_mask",
>>>             "type_ids": "type_ids",
>>>         },
>>>         "output_map": {
>>>             "embedding": "embedding",
>>>         },
>>>         "dropout": 0, // dropout rate
>>>         "embedding_dim": "*@*",
>>>     },
>>>     "_link": {
>>>         "config.pretrained_model_path": ["module.config.pretrained_model_path"],
>>>     },
>>>     "_name": "pretrained_transformers",
>>> }
- Config Example2 (for gathering the embedding):
>>> {
>>>     "module": {
>>>         "_base": "roberta",
>>>     },
>>>     "config": {
>>>         "pretrained_model_path": "*@*",
>>>         "input_map": {
>>>             "input_ids": "input_ids",
>>>             "attention_mask": "subword_mask",
>>>             "type_ids": "type_ids",
>>>             "gather_index": "gather_index",
>>>         },
>>>         "output_map": {
>>>             "embedding": "embedding",
>>>         },
>>>         "embedding_dim": "*@*",
>>>         "dropout": 0, // dropout rate
>>>     },
>>>     "_link": {
>>>         "config.pretrained_model_path": ["module.config.pretrained_model_path"],
>>>     },
>>>     "_name": "pretrained_transformers",
>>> }
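A minimal sketch (not the library's code) of what the extra "gather_index" key in Example2 enables: selecting one subword vector per original word from the transformer output. The shapes and index values below are illustrative assumptions:
>>> import torch
>>> batch, subword_len, hidden = 2, 6, 4
>>> embedding = torch.randn(batch, subword_len, hidden)  # transformer output
>>> # hypothetical: position of the first subword of each word, (batch, word_len)
>>> gather_index = torch.tensor([[0, 1, 3, 5], [0, 2, 3, 4]])
>>> index = gather_index.unsqueeze(-1).expand(-1, -1, hidden)
>>> word_embedding = torch.gather(embedding, 1, index)  # one vector per word
>>> word_embedding.shape
torch.Size([2, 4, 4])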
dlk.core.layers.embeddings.random module
- class dlk.core.layers.embeddings.random.RandomEmbedding(config: dlk.core.layers.embeddings.random.RandomEmbeddingConfig)[source]
Bases:
dlk.core.base_module.SimpleModule
Generate 'embedding' from 'input_ids'
- forward(inputs: Dict[str, torch.Tensor]) Dict[str, torch.Tensor] [source]
get the random embedding
- Parameters
inputs – one mini-batch inputs
- Returns
one mini-batch outputs
- init_weight(method: Callable)[source]
Initialize the weights of submodules with 'method'
- Parameters
method – init method
- Returns
None
- training: bool
- class dlk.core.layers.embeddings.random.RandomEmbeddingConfig(config: Dict)[source]
Bases:
dlk.core.base_module.BaseModuleConfig
Config for RandomEmbedding
- Config Example:
>>> {
>>>     "config": {
>>>         "vocab_size": "*@*",
>>>         "embedding_dim": "*@*",
>>>         "dropout": 0, // dropout rate
>>>         "padding_idx": 0, // padding index
>>>         "output_map": {},
>>>         "input_map": {},
>>>     },
>>>     "_name": "random",
>>> }
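A minimal sketch of what the config describes, with concrete stand-ins for the "*@*" placeholders: a randomly initialized lookup table followed by dropout:
>>> import torch
>>> import torch.nn as nn
>>> vocab_size, embedding_dim = 30, 16  # stand-ins for the "*@*" placeholders
>>> embed = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
>>> dropout = nn.Dropout(p=0.0)  # "dropout" in the config
>>> input_ids = torch.randint(0, vocab_size, (2, 8))
>>> dropout(embed(input_ids)).shape  # (batch, seq_len, embedding_dim)
torch.Size([2, 8, 16])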
dlk.core.layers.embeddings.static module
- class dlk.core.layers.embeddings.static.StaticEmbedding(config: dlk.core.layers.embeddings.static.StaticEmbeddingConfig)[source]
Bases:
dlk.core.base_module.SimpleModule
Generate a static 'embedding' (e.g. GloVe, word2vec) from 'input_ids'
- forward(inputs: Dict[str, torch.Tensor]) Dict[str, torch.Tensor] [source]
Get the pretrained static embedding (e.g. GloVe, word2vec)
- Parameters
inputs – one mini-batch inputs
- Returns
one mini-batch outputs
- init_weight(method)[source]
Initialize the weights of submodules with 'method'
- Parameters
method – init method
- Returns
None
- training: bool
- class dlk.core.layers.embeddings.static.StaticEmbeddingConfig(config: Dict)[source]
Bases:
dlk.core.base_module.BaseModuleConfig
Config for StaticEmbedding
- Config Example:
>>> {
>>>     "config": {
>>>         "embedding_file": "*@*", // the embedding file, must be saved as a numpy array by pickle
>>>         "embedding_dim": "*@*",
>>>         // if the embedding_file is a dict, provide the dict trace to the embedding
>>>         "embedding_trace": ".", // "." means the file itself is the embedding
>>>         /* embedding_trace: "embedding", // means embedding = pickle.load(embedding_file)["embedding"] */
>>>         /* embedding_trace: "meta.embedding", // means embedding = pickle.load(embedding_file)["meta"]["embedding"] */
>>>         "freeze": false, // whether to freeze the embedding
>>>         "padding_idx": 0, // padding index
>>>         "dropout": 0, // dropout rate
>>>         "output_map": {},
>>>         "input_map": {}, // required_key: provide_key
>>>     },
>>>     "_name": "static",
>>> }
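A minimal sketch (a hypothetical helper, not the library's loader) of how "embedding_file" and "embedding_trace" fit together, following the comments in the example above:
>>> import pickle
>>> import torch
>>> import torch.nn as nn
>>> def load_static_embedding(embedding_file, embedding_trace="."):
...     with open(embedding_file, "rb") as f:
...         obj = pickle.load(f)  # a numpy array, or a dict containing one
...     if embedding_trace != ".":  # "." means the file itself is the embedding
...         for key in embedding_trace.split("."):  # e.g. "meta.embedding"
...             obj = obj[key]
...     weight = torch.tensor(obj, dtype=torch.float)
...     return nn.Embedding.from_pretrained(weight, freeze=False, padding_idx=0)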
dlk.core.layers.embeddings.static_char_cnn module
- class dlk.core.layers.embeddings.static_char_cnn.StaticCharCNNEmbedding(config: dlk.core.layers.embeddings.static_char_cnn.StaticCharCNNEmbeddingConfig)[source]
Bases:
dlk.core.base_module.SimpleModule
Generate 'embedding' from 'char_ids'
- forward(inputs: Dict[str, torch.Tensor]) Dict[str, torch.Tensor] [source]
Feed the char embedding through the CNN and pool it to a word-level embedding
- Parameters
inputs – one mini-batch inputs
- Returns
one mini-batch outputs
- init_weight(method)[source]
Initialize the weights of submodules with 'method'
- Parameters
method – init method
- Returns
None
- training: bool
- class dlk.core.layers.embeddings.static_char_cnn.StaticCharCNNEmbeddingConfig(config: Dict)[source]
Bases:
dlk.core.base_module.BaseModuleConfig
Config for StaticCharCNNEmbedding
- Config Example:
>>> {
>>>     "module@cnn": {
>>>         "_base": "conv1d",
>>>         "config": {
>>>             "in_channels": -1,
>>>             "out_channels": -1, // will be updated while loading the embedding
>>>             "kernel_sizes": [3],
>>>         },
>>>     },
>>>     "config": {
>>>         "embedding_file": "*@*", // the embedding file, must be saved as a numpy array by pickle
>>>         // if the embedding_file is a dict, provide the dict trace to the embedding
>>>         "embedding_trace": ".", // "." means the file itself is the embedding
>>>         /* embedding_trace: "char_embedding", // means embedding = pickle.load(embedding_file)["char_embedding"] */
>>>         /* embedding_trace: "meta.char_embedding", // means embedding = pickle.load(embedding_file)["meta"]["char_embedding"] */
>>>         "freeze": false, // whether to freeze the embedding
>>>         "dropout": 0, // dropout rate
>>>         "embedding_dim": 35, // char embedding dim
>>>         "kernel_sizes": [3], // cnn kernel sizes
>>>         "padding_idx": 0,
>>>         "output_map": {"char_embedding": "char_embedding"},
>>>         "input_map": {"char_ids": "char_ids"},
>>>     },
>>>     "_link": {
>>>         "config.embedding_dim": ["module@cnn.config.in_channels", "module@cnn.config.out_channels"],
>>>         "config.kernel_sizes": ["module@cnn.config.kernel_sizes"],
>>>     },
>>>     "_name": "static_char_cnn",
>>> }
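A minimal sketch of the char-CNN idea the docstring describes: embed "char_ids", run a Conv1d over the characters of each word, and max-pool to one vector per word. All shapes and values are illustrative assumptions, not the module's internals:
>>> import torch
>>> import torch.nn as nn
>>> char_vocab, char_dim, kernel_size = 100, 35, 3  # illustrative values
>>> embed = nn.Embedding(char_vocab, char_dim, padding_idx=0)
>>> cnn = nn.Conv1d(char_dim, char_dim, kernel_size, padding=kernel_size // 2)
>>> batch, seq_len, word_len = 2, 8, 6
>>> char_ids = torch.randint(0, char_vocab, (batch, seq_len, word_len))
>>> x = embed(char_ids)  # (batch, seq_len, word_len, char_dim)
>>> x = x.view(batch * seq_len, word_len, char_dim).transpose(1, 2)
>>> x = cnn(x)  # conv over the characters of each word
>>> char_embedding = x.max(dim=-1).values.view(batch, seq_len, char_dim)
>>> char_embedding.shape  # one char-derived vector per word
torch.Size([2, 8, 35])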
Module contents
embeddings
- class dlk.core.layers.embeddings.EmbeddingInput(**args)[source]
Bases:
object
docstring for EmbeddingInput