
Optimizing this stacking of N weight-sharing Keras models


猛跑小猪 2021-06-29 17:58:00
I have two Keras (TensorFlow backend) models that are stacked together to form a combined model:

small_model with In: (None, K), Out: (None, K)
large_model with In: (None, N, K), Out: (None, 1)
combined_model (N x small_model -> large_model) with In: (None, N, K), Out: (None, 1)

large_model needs the N stacked outputs of small_model as its input. I can define N copies of small_model with shared weights, concatenate their outputs (technically, I need to stack them) and send the result to large_model, as in the code below. My problem is that I need this to work for very large N (> 10**6), and my current solution uses a lot of memory and time just to build the model, even for N ~ 10**2. I am hoping for a solution that sends the N data points through small_model in parallel (the way a batch is fed to a model), collects them (keeping the Keras graph so that backpropagation remains possible) and passes them to large_model, without having to define N instances of small_model. The input and output shapes of the three models listed above should not change, but other intermediate models can of course be defined.

Current unsatisfactory solution (assuming small_model and large_model already exist, and N and K are defined):

from keras.layers import Input, Lambda
from keras.models import Model
from keras import backend as K


def build_small_model_on_batch():
    def distribute_inputs_to_small_model(input):
        return [small_model(input[:,i]) for i in range(N)]

    def stacker(list_of_tensors):
        return K.stack(list_of_tensors, axis=1)

    input = Input(shape=(N,K,))
    small_model_outputs = Lambda(distribute_inputs_to_small_model)(input)
    stacked_small_model_outputs = Lambda(stacker)(small_model_outputs)
    return Model(input, stacked_small_model_outputs)


def build_combined():
    input = Input(shape=(N,K,))
    stacked_small_model_outputs = small_model_on_batch(input)
    output = large_model(stacked_small_model_outputs)
    return Model(input, output)


small_model_on_batch = build_small_model_on_batch()
combined = build_combined()

1 Answer

四季花海


You can do this with the TimeDistributed layer wrapper:


from keras.layers import Input, Dense, TimeDistributed
from keras.models import Sequential, Model


N = None  # Use fixed value if you do not want variable input size
K = 20


def small_model():
    inputs = Input(shape=(K,))
    # Define the small model
    # Here it is just a single dense layer
    outputs = Dense(K, activation='relu')(inputs)
    return Model(inputs=inputs, outputs=outputs)


def large_model():
    inputs = Input(shape=(N, K))
    # Define the large model
    # Just a single neuron here
    outputs = Dense(1, activation='relu')(inputs)
    return Model(inputs=inputs, outputs=outputs)


def combined_model():
    inputs = Input(shape=(N, K))
    # The TimeDistributed layer applies the given model
    # to every input across dimension 1 (N)
    small_model_out = TimeDistributed(small_model())(inputs)
    # Apply large model
    outputs = large_model()(small_model_out)
    return Model(inputs=inputs, outputs=outputs)


model = combined_model()
model.compile(loss='mean_squared_error', optimizer='sgd')
model.summary()

Output:


_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, None, 20)          0
_________________________________________________________________
time_distributed_1 (TimeDist (None, None, 20)          420
_________________________________________________________________
model_2 (Model)              (None, None, 1)           21
=================================================================
Total params: 441
Trainable params: 441
Non-trainable params: 0
_________________________________________________________________
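
For reference, a minimal smoke test of the combined model above could look like the following sketch (the data is random and purely illustrative; the batch sizes and the two sequence lengths standing in for N are arbitrary choices, not part of the original answer):

import numpy as np

# Random data, purely illustrative. Because N is None in the Input shapes,
# the number of points only has to be consistent within a single array.
x = np.random.rand(32, 100, K).astype('float32')   # 32 samples, N = 100, dimension K
# As written, the toy large_model keeps the N axis, so its output
# (and therefore the targets) has shape (batch, N, 1).
y = np.random.rand(32, 100, 1).astype('float32')

model.fit(x, y, epochs=1, batch_size=8)

# A different N can be used at prediction time without rebuilding the model.
preds = model.predict(np.random.rand(4, 250, K).astype('float32'))
print(preds.shape)   # (4, 250, 1)

Because TimeDistributed reuses a single set of small_model weights across all N positions, building the graph no longer scales with N, which is what makes very large N (such as the > 10**6 case in the question) feasible.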

